Installing and upgrading on cluster nodes
Planning and preparing for cluster installation
Before carrying out cluster installation, you need to plan hardware and network details.
Caution
- If you are using a shared storage device, it is very important that only one node has access to the cluster disk when you turn on the computers and start the operating system before the cluster is created. Otherwise, the cluster disks can become corrupted. To prevent the corruption of the cluster disks, shut down all but one cluster node, or use other techniques (for example, LUN masking, selective presentation, or zoning) to protect the cluster disks before creating the cluster. Once the Cluster service is running properly on one node, the other nodes can be installed and configured simultaneously.
Each node of your cluster must be running Windows Server 2003, Enterprise Edition, or Windows Server 2003, Datacenter Edition.
Cluster hardware and drivers
Microsoft supports only complete server cluster systems that are compatible with the Windows Server 2003 family.
For cluster disks, you must use the NTFS file system and configure the disks as basic disks. You cannot configure cluster disks as dynamic disks, and you cannot use features of dynamic disks such as spanned volumes (volume sets).
Review the manufacturer's instructions carefully before you begin installing cluster hardware. Otherwise, the cluster storage could be corrupted. If your cluster hardware includes a SCSI bus, be sure to carefully review any instructions about termination of the SCSI bus and configuration of SCSI IDs.
To simplify configuration and eliminate potential compatibility problems, consider using identical hardware for all nodes.
Network adapters on the cluster nodes
In your planning, decide what kind of communication each network adapter will carry. The following list provides details about the types of communication that an adapter can carry:
- Only node-to-node communication (private network). This implies that the server has one or more additional adapters to carry other communication.
For node-to-node communication, you connect the network adapter to a private network that is used exclusively within the cluster. Note that if the private network uses a single hub or network switch, that piece of equipment becomes a potential point of failure in your cluster.
The nodes of a cluster must be on the same subnet, but you can use virtual LAN (VLAN) switches on the interconnects between two nodes. If you use a VLAN, the point-to-point, round-trip latency must be less than 1/2 second, and the link between two nodes must appear as a single point-to-point connection from the perspective of the Windows operating system running on the nodes. To avoid single points of failure, use independent VLAN hardware for the different paths between the nodes.
If your nodes use multiple private (node-to-node) networks, it is a best practice for the adapters for those networks to use static IP addresses instead of DHCP.
- Only client-to-cluster communication (public network). This implies that the server has one or more additional adapters to carry other communication.
- Both node-to-node and client-to-cluster communication (mixed network). When you have multiple network adapters per node, a network adapter that carries both kinds of communication can serve as a backup for other network adapters.
- Communication unrelated to the cluster. If a clustered node also provides services unrelated to the cluster, and there are enough adapters in the cluster node, you might want to use one adapter for carrying communication unrelated to the cluster.
The nodes of a cluster must be connected by two or more local area networks (LANs); at least two networks are required to prevent a single point of failure. A server cluster whose nodes are connected by only one network is not a supported configuration. The adapters, cables, hubs, and switches for each network must fail independently. This usually implies that the components of any two networks must be physically independent.
At least two networks must be configured to handle All communications (mixed network) or Internal cluster communications only (private network).
The recommended configuration for two adapters is to use one adapter for the private (node-to-node only) communication and the other adapter for mixed communication (node-to-node plus client-to-cluster communication).
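The network role requirements above can be checked against a written plan before installation. The following is a minimal sketch, not part of the product: the function name `check_network_roles` and the dictionary layout are assumptions made for illustration, and only the three role labels come from Cluster Administrator.

```python
# Role labels copied from the Cluster Administrator options named in the text.
MIXED = "All communications (mixed network)"
PRIVATE = "Internal cluster communications only (private network)"
PUBLIC = "Client access only (public network)"


def check_network_roles(planned_roles):
    """Return a list of problems for a {network name: role} plan (hypothetical helper)."""
    problems = []
    # A cluster whose nodes are connected by only one network is not supported.
    if len(planned_roles) < 2:
        problems.append("nodes must be connected by two or more networks")
    # Networks that can carry node-to-node traffic are the mixed and private ones.
    internal = [name for name, role in planned_roles.items() if role in (MIXED, PRIVATE)]
    if len(internal) < 2:
        problems.append("at least two networks must be mixed or private")
    return problems


# Recommended two-adapter layout: one private network and one mixed network.
plan = {"Private heartbeat": PRIVATE, "Public and backup": MIXED}
print(check_network_roles(plan))
```

An empty list means the plan satisfies both rules. A plan with one private and one public-only network would be reported, because only one network could then carry node-to-node communication.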
Consider choosing a name for each connection that tells what it is intended for. The name makes it easier to identify the connection whenever you are configuring the server.
Notes
- If you use fault-tolerant network adapters, create multiple private networks instead of a single fault-tolerant network.
- Do not use teaming network adapters on the private network.
- Do not configure a default gateway or DNS or WINS server on the private network adapters. Do not configure private network adapters to use name resolution servers on the public network; otherwise, a name resolution server on the public network might map a name to an IP address on the private network. If a client then received that IP address from the name resolution server, it might fail to reach the address because no route from the client to the private network address exists.
- Configure WINS and/or DNS servers on the public network adapters. If Network Name resources are used on the public networks, set up the DNS servers to support dynamic updates; otherwise the Network Name resources may not fail over correctly. Also, configure a default gateway on the public network adapters. If there are multiple public networks in the cluster, configure a default gateway on only one of these.
- The adapters on a given node must connect to networks in different subnets.
- When you use either the New Server Cluster Wizard or the Add Nodes Wizard to install clustering on a node that contains two network adapters, by default the wizard configures both of the network adapters for mixed network communications. As a best practice, reconfigure one adapter for private network communications only.
- --------------------------------------------------------------------------------------------
To change how the cluster uses a network
- Open Cluster Administrator.
- In the console tree, double-click to expand Cluster Configuration, and then click Networks.
- In the details pane, click the appropriate network.
- On the File menu, click Properties.
- Under Enable this network for cluster use, specify how you want the network to be used by the cluster:
- To use the network for communication with clients and between nodes, click All communications (mixed network).
- To use the network only for communications between nodes, click Internal cluster communications only (private network).
- To use the network only for communications with clients, click Client access only (public network).
- --------------------------------------------------------------------------------------------
- Manually configure the communication settings, such as Speed, Duplex Mode, Flow Control and Media Type of each cluster network adapter. Do not use automatic detection. You must configure all of the cluster network adapters to use the same communication settings.
- Confirm that your entire cluster solution is compatible with the products in the Windows Server 2003 family.
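Several of the notes above are simple rules that can be checked against a planned adapter layout for one node. The sketch below is illustrative only: the `Adapter` class, its field names, and `check_node_adapters` are hypothetical, and the addresses are made-up examples. It covers two of the rules, namely that private adapters carry no default gateway, DNS, or WINS settings, and that the adapters on a given node connect to different subnets.

```python
import ipaddress
from dataclasses import dataclass, field
from itertools import combinations
from typing import List, Optional


@dataclass
class Adapter:
    """Planned settings for one network adapter (hypothetical structure)."""
    name: str
    role: str                      # "private", "public", or "mixed"
    address: str                   # address with prefix length, e.g. "10.0.0.1/24"
    default_gateway: Optional[str] = None
    dns_servers: List[str] = field(default_factory=list)
    wins_servers: List[str] = field(default_factory=list)


def check_node_adapters(adapters):
    """Return a list of problems found in one node's adapter plan."""
    problems = []
    for adapter in adapters:
        # Private adapters must not have a default gateway, DNS, or WINS server.
        if adapter.role == "private" and (
            adapter.default_gateway or adapter.dns_servers or adapter.wins_servers
        ):
            problems.append(f"{adapter.name}: remove gateway, DNS, and WINS settings")
    # The adapters on a given node must connect to networks in different subnets.
    for first, second in combinations(adapters, 2):
        net_first = ipaddress.ip_interface(first.address).network
        net_second = ipaddress.ip_interface(second.address).network
        if net_first.overlaps(net_second):
            problems.append(f"{first.name} and {second.name} share a subnet")
    return problems


node1 = [
    Adapter("Private", "private", "10.0.0.1/24"),
    Adapter("Public", "mixed", "192.168.1.10/24",
            default_gateway="192.168.1.1", dns_servers=["192.168.1.2"]),
]
print(check_node_adapters(node1))
```

The remaining notes (for example, identical speed and duplex settings on all adapters) could be added as further checks in the same style.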
Cluster IP address
IP addressing for cluster nodes
Determine how to handle the IP addressing for the individual cluster nodes. Each network adapter on each node requires IP addressing. It is a best practice to assign each network adapter a static IP address. As an alternative, you can provide IP addressing through DHCP. If you use static IP addresses, set the addresses for each linked pair of network adapters (linked node-to-node) to be on the same subnet.
Note that if you use DHCP for the individual cluster nodes, it can act as a single point of failure. That is, if you set up your cluster nodes so that they depend on a DHCP server for their IP addresses, temporary failure of the DHCP server can mean temporary unavailability of the cluster nodes. When deciding whether to use DHCP, evaluate ways to ensure availability of DHCP services, and consider the possibility of using long leases for the cluster nodes. This helps to ensure that they always have a valid IP address.
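When planning static addresses, the rule that each linked pair of adapters (node-to-node) must be on the same subnet can be verified with a few lines. This is a sketch under assumptions: `same_subnet` is a hypothetical helper, and the addresses shown are examples only.

```python
import ipaddress


def same_subnet(address_a, address_b):
    """Return True if two addresses given with prefix lengths are in one subnet."""
    net_a = ipaddress.ip_interface(address_a).network
    net_b = ipaddress.ip_interface(address_b).network
    return net_a == net_b


# Private adapters on node 1 and node 2 that are linked to each other.
print(same_subnet("10.0.0.1/24", "10.0.0.2/24"))
# A mismatched pair that should be corrected before installation.
print(same_subnet("10.0.0.1/24", "10.0.1.2/24"))
```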
Cluster name
Determine or obtain an appropriate name for the cluster. This is the name administrators will use for connections to the cluster. (The actual applications running on the cluster typically have different network names.) The cluster name must be different from the domain name, from all computer names on the domain, and from other cluster names on the domain.
Computer accounts and domain assignment for cluster nodes
Make sure that the cluster nodes all have computer accounts in the same domain. Cluster nodes cannot be in a workgroup.
Operator user account for installing and configuring the Cluster service
To install and configure the Cluster service, you must be using an account that is in the local Administrators group on each node. As you install and configure each node, if you are not using an account in the local Administrators group, you will be prompted to provide the logon credentials for such an account.
Cluster service user account
Create or obtain the Cluster service user account. This is the name and password under which the Cluster service will run. You need to supply this user name and password during cluster installation.
It is best if the Cluster service user account is an account not used for any other purpose. If you have multiple clusters, set up a unique Cluster service user account for each cluster. The account must be a domain account; it cannot be a local account. However, do not make this account a domain administrator account because it does not need domain administrator user rights.
As part of the cluster setup process, the Cluster service user account is added to the local Administrators group on each node. In addition to being a member of the local Administrators group, the Cluster service user account requires the following user rights:
- Act as part of the operating system
- Back up files and directories
- Adjust memory quotas for a process
- Increase scheduling priority
- Log on as a service
- Restore files and directories
In addition, by default, the Cluster service account inherits the following user rights as a result of being a member of the local Administrators group:
- Manage auditing and security log
- Debug programs
- Impersonate a client after authentication
If your organization has removed these user rights from the default set of privileges assigned to the local Administrators group, you need to specifically assign these user rights to the Cluster service account.
The preceding user rights are granted to the Cluster service user account as part of the cluster setup process. Be aware that the Cluster service user account will continue to have these user rights even after all nodes are evicted from the cluster. The risk that this presents is mitigated by the fact that these user rights are not granted domain wide, but rather only locally on each former node. However, remove this account from each evicted node if it is no longer needed.
Be sure to keep the password from expiring on the Cluster service user account (follow your organization's policies for password renewal).
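If your organization tightens the default privileges, it can help to compare the rights actually granted to the Cluster service account with the rights listed above. The sketch below assumes you have already collected the granted rights by some other means (for example, from a security policy export). The function `missing_rights` is hypothetical, and the `Se...` identifiers are the standard Windows constant names for these rights, which do not appear in the text above.

```python
# User rights the Cluster service account requires, keyed by Windows constant name.
REQUIRED_RIGHTS = {
    "SeTcbPrivilege": "Act as part of the operating system",
    "SeBackupPrivilege": "Back up files and directories",
    "SeIncreaseQuotaPrivilege": "Adjust memory quotas for a process",
    "SeIncreaseBasePriorityPrivilege": "Increase scheduling priority",
    "SeServiceLogonRight": "Log on as a service",
    "SeRestorePrivilege": "Restore files and directories",
}

# Rights normally inherited through membership in the local Administrators group.
INHERITED_RIGHTS = {
    "SeSecurityPrivilege": "Manage auditing and security log",
    "SeDebugPrivilege": "Debug programs",
    "SeImpersonatePrivilege": "Impersonate a client after authentication",
}


def missing_rights(granted):
    """Return the display names of needed rights absent from the granted set."""
    needed = {**REQUIRED_RIGHTS, **INHERITED_RIGHTS}
    return sorted(name for constant, name in needed.items() if constant not in granted)


# Example: an account on a node where "Debug programs" was removed from Administrators.
granted = set(REQUIRED_RIGHTS) | {"SeSecurityPrivilege", "SeImpersonatePrivilege"}
print(missing_rights(granted))
```

Any right reported here would need to be assigned to the Cluster service account explicitly.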
Volume for important cluster configuration information (checkpoint and log files)
Plan on setting aside a volume on your cluster storage for holding important cluster configuration information. This information makes up the cluster quorum resource, which is needed when a cluster node stops functioning. The quorum resource provides node-independent storage of crucial data needed by the cluster.
The recommended minimum size for the volume is 500 MB. It is recommended that you do not store user data on any volume in the quorum resource. Do not use Shadow Copies for Shared Folders for the quorum resource. If you plan to put the quorum resource on a disk with multiple NTFS partitions, ensure that all partitions on the disk are assigned drive letters.
Note
- When planning and carrying out disk configuration for the cluster disks, configure them as basic disks with all partitions formatted as NTFS (they can be either compressed or uncompressed). Partition and format all disks on the cluster storage device before adding the first node to your cluster. Do not configure them as dynamic disks, and do not use spanned volumes (volume sets) or Remote Storage on the cluster disks. Cluster disks on the cluster storage device must be partitioned as master boot record (MBR) and not as GUID partition table (GPT) disks.
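The disk rules in this section (basic disks, NTFS partitions, MBR partitioning, and a recommended quorum volume size of at least 500 MB) can be written down as a planning check. The following is a minimal sketch: the `PlannedDisk` structure and `check_cluster_disk` are hypothetical names, and the values are entered by hand from your own plan rather than read from the system.

```python
from dataclasses import dataclass, field
from typing import List

QUORUM_MIN_MB = 500  # recommended minimum size for the quorum volume


@dataclass
class PlannedDisk:
    """Planned configuration of one cluster disk (hypothetical structure)."""
    name: str
    disk_type: str = "basic"        # "basic" or "dynamic"
    partition_style: str = "MBR"    # "MBR" or "GPT"
    file_systems: List[str] = field(default_factory=lambda: ["NTFS"])
    quorum_volume_mb: int = 0       # size of the quorum volume, 0 if none


def check_cluster_disk(disk):
    """Return a list of rule violations for one planned cluster disk."""
    problems = []
    if disk.disk_type != "basic":
        problems.append(f"{disk.name}: configure as a basic disk, not dynamic")
    if disk.partition_style != "MBR":
        problems.append(f"{disk.name}: partition as MBR, not GPT")
    if any(fs != "NTFS" for fs in disk.file_systems):
        problems.append(f"{disk.name}: format all partitions as NTFS")
    if 0 < disk.quorum_volume_mb < QUORUM_MIN_MB:
        problems.append(f"{disk.name}: quorum volume is below {QUORUM_MIN_MB} MB")
    return problems


print(check_cluster_disk(PlannedDisk("Disk Q", quorum_volume_mb=512)))
print(check_cluster_disk(PlannedDisk("Disk R", disk_type="dynamic", partition_style="GPT")))
```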
The following section describes the physical installation of the cluster storage.
Beginning the installation of the cluster hardware
The steps you carry out when first physically connecting and installing the cluster hardware are crucial. Be sure to follow the hardware manufacturer's instructions for these initial steps.
Important
- Carefully review your network cables after connecting them. Make sure no cables are crossed by mistake (for example, private network connected to public).
Initial steps to carry out in the BIOS or EFI when using a SCSI shared storage device
If you are using a SCSI shared storage device, when you first attach your cluster hardware (the shared bus and cluster storage), be sure to work only from the firmware configuration screens on the cluster nodes (a node is a server in a cluster). On a 32-bit computer, use the BIOS configuration screens. On an Itanium architecture-based computer, use the Extensible Firmware Interface (EFI) configuration screens. The instructions from your manufacturer describe whether these configuration screens are displayed automatically or whether you must, after turning on the computer, press specific keys to access them. Follow the manufacturer's instructions for completing the BIOS or EFI configuration process. Remain in the BIOS or EFI configuration screens, and do not allow the operating system to start, during this initial installation phase. Complete the following steps while the cluster nodes are still displaying BIOS or EFI configuration screens, before starting the operating system on the first cluster node.
Important
- Make sure you understand and follow the manufacturer's instructions for termination of the SCSI bus.
- Make sure that each device on the shared bus (both SCSI controllers and hard disks) has a unique SCSI ID. If the SCSI controllers all have the same default ID (often it is SCSI ID 7), change one controller to a different SCSI ID, such as SCSI ID 6. If there is more than one disk that will be on the shared SCSI bus, each disk must also have a unique SCSI ID. In addition, make sure that the bus is not configured to reset SCSI IDs automatically on startup (otherwise the IDs will change from the settings you specify).
- Ensure that you can scan the bus and see the drives from all cluster nodes (while remaining in the BIOS or EFI configuration screens).
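The unique-ID requirement for the shared SCSI bus can be checked on paper before you power anything on. The sketch below is illustrative: `find_scsi_id_conflicts` and the device names are hypothetical, and the IDs would come from your own firmware configuration notes.

```python
from collections import defaultdict


def find_scsi_id_conflicts(device_ids):
    """Map each SCSI ID used more than once to the devices that share it."""
    devices_by_id = defaultdict(list)
    for device, scsi_id in device_ids.items():
        devices_by_id[scsi_id].append(device)
    return {scsi_id: names for scsi_id, names in devices_by_id.items() if len(names) > 1}


# Both controllers left at the common default of SCSI ID 7.
shared_bus = {"controller-node1": 7, "controller-node2": 7, "disk-1": 0, "disk-2": 1}
print(find_scsi_id_conflicts(shared_bus))

# After changing one controller to SCSI ID 6, no conflicts remain.
shared_bus["controller-node2"] = 6
print(find_scsi_id_conflicts(shared_bus))
```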
Initial steps to carry out in the BIOS or EFI when using a fibre channel shared storage device or no shared storage device
- Turn on a single node. Leave all other nodes turned off.
- During this initial installation phase, remain in the BIOS or Extensible Firmware Interface (EFI) configuration process, and do not allow the operating system to start. While viewing the BIOS or EFI configuration screens, ensure that you can scan the bus and see the drives from the active cluster node. On a 32-bit computer, use the BIOS configuration screens. On an Itanium architecture-based computer, use the EFI configuration screens. Consult the instructions from your manufacturer to determine whether these configuration screens are displayed automatically or whether you must, after turning on the computer, press specific keys to access them. Follow the manufacturer's instructions for completing the BIOS or EFI configuration process.
Final steps to complete the installation
If you have not already installed Windows Server 2003, Enterprise Edition, or Windows Server 2003, Datacenter Edition, on the first cluster node, install it before proceeding. After you complete the BIOS or EFI configuration, start the operating system on one cluster node only and complete the configuration of the Cluster service using Cluster Administrator.
With the Cluster Administrator New Server Cluster Wizard, you can choose between Typical (full) configuration and Advanced (minimum) configuration options. Typical configuration is appropriate for most installations and results in a completely configured cluster. Use the Advanced configuration option only for clusters that have complex storage configurations that the New Server Cluster Wizard cannot validate or for configurations in which you do not want the cluster to manage all of the storage. The following examples describe each situation:
- In some complex storage solutions, such as a fibre channel switched fabric that contains several switches, a particular storage unit might have a different identity on each computer in the cluster. Although this is a valid storage configuration, it violates the storage validation heuristics in the New Server Cluster Wizard. If you have this type of storage solution, you might receive an error when you try to create a cluster using the Typical configuration option. If your storage configuration is set up correctly, you can disable the storage validation heuristics and avoid this error by restarting the New Server Cluster Wizard and selecting the Advanced configuration option instead.
- On particular nodes in a cluster, you may want to have some disks that are to be clustered and some disks that are to be kept private. The Typical configuration option configures all disks as clustered disks and creates cluster resources for them all. However, with the Advanced configuration option, you can keep certain disks private because this configuration creates a cluster in which only the quorum disk is managed by the cluster (if you chose to use a physical disk as the quorum resource). After the cluster is created, you must then use Cluster Administrator to add any other disks that you want the cluster to manage.
Important
- If you are using a shared storage device: Before creating a cluster, when you turn the computer on and start the operating system, it is very important that only one node has access to the cluster disk. Otherwise, the cluster disks can become corrupted. To prevent the corruption of the cluster disks, shut down all but one cluster node, or use other techniques (for example, LUN masking, selective presentation, or zoning) to protect the cluster disks before creating the cluster. Also, before starting the installation of the second and subsequent nodes, ensure that all disks that are to be managed by the cluster have disk resources associated with them. If these disks do not have disk resources associated with them at this time, the disk data will be corrupted because the disks will not be protected and multiple nodes will attempt to connect to them at the same time.
- Do not use Manage Your Server or the Configure Your Server Wizard to configure cluster nodes.
Quorum resource options when installing server clusters
With server clusters on the Windows Server operating systems, you can now choose between three ways to set up the quorum resource (the resource that maintains the definitive copy of the cluster configuration data and that must always be available for the cluster to run).
The first is a single node server cluster, which has been available in the past and continues to be supported. A single node cluster is often used for development and testing and can be configured with, or without, external cluster storage devices. For single node clusters without an external cluster storage device, the local disk is configured as the cluster quorum device.
The second option is a single quorum device server cluster, which has also been available in earlier Windows versions. This model places the cluster configuration data on a shared cluster storage device that all nodes can access. The general topology is:
This is the most common model and is recommended for most situations. You might choose the single quorum device model if all of your cluster nodes are in the same location and you want to take advantage of the fact that such a cluster continues supporting users even if only one node is running.
The third option, which is newer for Windows Server, is a "majority node set." A majority node set is a single quorum resource from a Server Cluster perspective; however, the cluster configuration data is actually stored on multiple disks across the cluster. The majority node set resource ensures that the cluster configuration data is kept consistent across the different disks. This allows cluster topologies as follows:
In the majority node set model, every node in the cluster uses a directory on its own local system disk to store the cluster configuration data. If the configuration of the cluster changes, that change is reflected across the different disks. Be aware that it is also possible to have shared storage devices in a majority node set cluster. The exact configuration depends on your installation's requirements.
Only use a majority node set cluster in targeted scenarios, such as:
- Geographically dispersed cluster: A cluster that spans multiple sites.
- Eliminating single points of failure: Although the quorum disk on a single cluster storage device can be made highly available through RAID, the controller port or the host bus adapter (HBA) itself may still be a single point of failure.
- Clusters with no shared disks: There are some specialized configurations that need tightly consistent cluster features without having shared disks.
- Clusters that host applications that can fail over, but where there is some other, application-specific way to replicate or mirror data between nodes.
Do not configure your cluster as a majority node set cluster unless it is part of a cluster solution offered by your Original Equipment Manufacturer (OEM), Independent Software Vendor (ISV), or Independent Hardware Vendor (IHV).
Cluster model considerations
Before implementing your cluster, consider what type of quorum resource solution you plan to use. Take into consideration the following differences between single quorum device clusters and majority node set:
Note
- The following information is presented to help you make basic decisions about the placement and management of your cluster nodes and quorum resource. It does not provide all the details about the requirements for each cluster model, or how each model handles failover situations.
Node failover behavior
The failover behavior of the majority node set is significantly different from the behavior of the single quorum device model:
- Using the single quorum device model, you can maintain cluster availability with only a single operational node.
- If you use a majority node set, more than half of the configured nodes, that is, (number of nodes configured in the cluster / 2, rounded down) + 1 nodes, must be operational in order to maintain cluster availability. The following table shows the number of node failures that a given majority node set cluster can tolerate yet continue to operate:
| Number of nodes configured in the cluster | Number of node failures allowed before cluster failure | Number of nodes needed to continue cluster operations |
|---|---|---|
| 1 | 0 | 1 |
| 2 | 0 | 2 |
| 3 | 1 | 2 |
| 4 | 1 | 3 |
| 5 | 2 | 3 |
| 6 | 2 | 4 |
| 7 | 3 | 4 |
| 8 | 3 | 5 |
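The majority rule can be expressed as a short calculation, which is useful when sizing clusters larger than the ones tabulated. This is a minimal sketch; the function names are chosen for illustration, and the division is integer division (rounded down), which is what makes the figures match the values above.

```python
def nodes_needed(configured_nodes):
    """Nodes that must be operational: more than half of the configured nodes."""
    return configured_nodes // 2 + 1


def failures_tolerated(configured_nodes):
    """Node failures a majority node set cluster can survive."""
    return configured_nodes - nodes_needed(configured_nodes)


# Print one row per cluster size: configured, failures allowed, nodes needed.
for n in range(1, 9):
    print(n, failures_tolerated(n), nodes_needed(n))
```

Note that adding a node to an odd-sized cluster does not increase the number of failures tolerated: a four-node cluster tolerates one failure, the same as a three-node cluster.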
Geographic considerations
You would commonly use a single quorum resource model if all nodes in your cluster will be in the same geographical location. As part of this requirement, your nodes must be connected to the same physical storage device.
A majority node set, on the other hand, would typically be appropriate if you have geographically dispersed nodes. The cluster configuration data is stored locally on each node on a file share that is shared out to the other nodes on the network. However, those shares must always be accessible or nodes can fail.
There are other specific requirements for geographically dispersed clusters, including the requirement that round-trip latency of the network between cluster nodes be a maximum of 500 milliseconds.
Hardware
Microsoft supports only complete server cluster systems that are compatible with the Windows Server family of products. Confirm that your entire cluster solution is compatible with products in the Windows Server family.