VMware vSAN Stretched Cluster & 2 Node Guide

Introduction

VMware Virtual SAN 6.1, shipping with vSphere 6.0 Update 1, introduced a new feature called VMware Virtual SAN Stretched Cluster. Virtual SAN Stretched Cluster is a specific configuration implemented in environments where disaster/downtime avoidance is a key requirement. This guide was developed to provide additional insight and information for installation, configuration and operation of a Virtual SAN Stretched Cluster infrastructure in conjunction with VMware vSphere. This guide will explain how vSphere handles specific failure scenarios and discuss various design considerations and operational procedures.

Virtual SAN Stretched Clusters with Witness Host refers to a deployment where a user sets up a Virtual SAN cluster with 2 active/active sites with an identical number of ESXi hosts distributed evenly between the two sites. The sites are connected via a high bandwidth/low latency link.

The third site hosting the Virtual SAN Witness Host is connected to both of the active/active data sites. This connectivity can be via low bandwidth/high latency links.

Each site is configured as a Virtual SAN Fault Domain. The nomenclature used to describe a Virtual SAN Stretched Cluster configuration is X+Y+Z, where X is the number of ESXi hosts at data site A, Y is the number of ESXi hosts at data site B, and Z is the number of witness hosts at site C. Data sites are where virtual machines are deployed. The minimum supported configuration is 1+1+1 (3 nodes). The maximum configuration is 15+15+1 (31 nodes). In Virtual SAN Stretched Clusters, there is only one witness host in any configuration.

A virtual machine deployed on a Virtual SAN Stretched Cluster will have one copy of its data on site A, a second copy of its data on site B, and any witness components placed on the witness host in site C. This configuration is achieved through fault domains alongside host and VM groups, and affinity rules. In the event of a complete site failure, there will be a full copy of the virtual machine data as well as greater than 50% of the components available. This will allow the virtual machine to remain available on the Virtual SAN datastore. If the virtual machine needs to be restarted on the other site, vSphere HA will handle this task.
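The availability rule above can be illustrated with a small model. This is only a sketch under simplifying assumptions: each object has exactly one replica per data site plus one witness component, and every component carries a single equal vote. The names `COMPONENTS` and `object_available` are illustrative and not part of any VMware API.

```python
# Illustrative model only: one replica per data site plus a witness component.
# Equal votes per component is a simplifying assumption.
COMPONENTS = {
    "site A": {"kind": "replica", "votes": 1},
    "site B": {"kind": "replica", "votes": 1},
    "site C": {"kind": "witness", "votes": 1},
}


def object_available(failed_sites):
    """Return True if a full replica and more than 50% of the votes survive."""
    surviving = {s: c for s, c in COMPONENTS.items() if s not in failed_sites}
    total_votes = sum(c["votes"] for c in COMPONENTS.values())
    surviving_votes = sum(c["votes"] for c in surviving.values())
    has_full_copy = any(c["kind"] == "replica" for c in surviving.values())
    return has_full_copy and surviving_votes * 2 > total_votes


print(object_available({"site A"}))            # True: site B replica + witness remain
print(object_available({"site A", "site C"}))  # False: only 1 of 3 votes remains
```

In this model, losing any single site leaves two of the three components, which is why a full site failure does not take the object offline.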

vSphere Versions

Virtual SAN Stretched Cluster configurations require vSphere 6.0 Update 1 (U1) or greater. This implies both vCenter Server 6.0 U1 and ESXi 6.0 U1. This version of vSphere includes Virtual SAN version 6.1. This is the minimum version required for Virtual SAN Stretched Cluster support.

vSphere & Virtual SAN

Virtual SAN version 6.1 introduced features including both All-Flash and Stretched Cluster functionality. There are no limitations on the edition of vSphere used for Virtual SAN. However, for Virtual SAN Stretched Cluster functionality, vSphere DRS is very desirable. DRS will provide initial placement assistance, and will also automatically migrate virtual machines to their correct site in accordance with Host/VM affinity rules. It can also help with relocating virtual machines to their correct site when a site recovers after a failure. Otherwise the administrator will have to carry out these tasks manually. Note that DRS is only available in Enterprise edition and higher of vSphere.

Hybrid and All-Flash Support

Virtual SAN Stretched Cluster is supported on both hybrid configurations (hosts with local storage comprised of both magnetic disks for capacity and flash devices for cache) and all-flash configurations (hosts with local storage made up of flash devices for capacity and flash devices for cache).

On-disk Formats

VMware supports Virtual SAN Stretched Cluster with the v2 on-disk format only. The v1 on-disk format is based on VMFS and is the original on-disk format used for Virtual SAN. The v2 on-disk format is the version which comes by default with Virtual SAN version 6.x. Customers that upgraded from the original Virtual SAN 5.5 to Virtual SAN 6.0 may not have upgraded the on-disk format from v1 to v2, and are thus still using v1. VMware recommends upgrading the on-disk format to v2 for improved performance and scalability, as well as stretched cluster support. In Virtual SAN 6.2 clusters, the v3 on-disk format allows for additional features, discussed later, specific to 6.2.

Witness Host as an ESXi VM

Both physical ESXi hosts and virtual ESXi hosts (nested ESXi) are supported for the witness host. VMware provides a Witness Appliance for those customers who wish to use the ESXi VM. A witness host/VM cannot be shared between multiple Virtual SAN Stretched Clusters.

Features Supported on VSAN but not VSAN Stretched

The following is a list of products and features supported on Virtual SAN but not on a stretched cluster implementation of Virtual SAN.

    • SMP-FT, the new Fault Tolerant VM mechanism introduced in vSphere 6.0, is supported on standard VSAN 6.1 deployments, but it is not supported on stretched cluster VSAN deployments at this time. The exception to this rule is when using 2 Node configurations in the same physical location.
    • The maximum value for NumberOfFailuresToTolerate in a Virtual SAN Stretched Cluster configuration is 1. This is the limit due to the maximum number of Fault Domains being 3.
    • In a Virtual SAN Stretched Cluster, there are only 3 Fault Domains. These are typically referred to as the Preferred, Secondary, and Witness Fault Domains. Standard Virtual SAN configurations can be comprised of up to 32 Fault Domains.
    • The Erasure Coding feature introduced in Virtual SAN 6.2 requires 4 Fault Domains for RAID5 type protection and 6 Fault Domains for RAID6 type protection. Because Stretched Cluster configurations only have 3 Fault Domains, Erasure Coding is not supported on Stretched Clusters at this time.
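The fault domain arithmetic behind the last three points can be sketched as follows. The RAID5 and RAID6 figures come from the list above; the 2n+1 rule for mirroring with NumberOfFailuresToTolerate=n is the general Virtual SAN requirement and is consistent with the stated maximum of 1 for three fault domains. The function names are illustrative only.

```python
STRETCHED_CLUSTER_FAULT_DOMAINS = 3  # Preferred, Secondary, Witness


def required_fault_domains(scheme, ftt=1):
    """Fault domains needed for a protection scheme.

    'mirroring' uses the 2n+1 rule; 'raid5' and 'raid6' use the
    fixed counts quoted in the list above.
    """
    if scheme == "mirroring":
        return 2 * ftt + 1
    if scheme == "raid5":
        return 4
    if scheme == "raid6":
        return 6
    raise ValueError(f"unknown protection scheme: {scheme}")


def supported(scheme, ftt=1, fault_domains=STRETCHED_CLUSTER_FAULT_DOMAINS):
    """True if the scheme fits in the available number of fault domains."""
    return required_fault_domains(scheme, ftt) <= fault_domains


for scheme, ftt in [("mirroring", 1), ("mirroring", 2), ("raid5", 1), ("raid6", 2)]:
    print(scheme, ftt, supported(scheme, ftt))
```

Only mirroring with a tolerance of 1 fits within three fault domains; every other combination needs more than a stretched cluster provides.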

Features Supported on vMSC but not VSAN Stretched

The following is a list of products and features supported on vSphere Metro Storage Cluster (vMSC) but not on a stretched cluster implementation of Virtual SAN.

    • RR-FT, the original (and now deprecated) Fault Tolerant mechanism for virtual machines, is supported on vSphere 5.5 for vMSC. It is not supported on stretched cluster Virtual SAN.
    • Note that the new SMP-FT, introduced in vSphere 6.0, is not supported on either vMSC or stretched cluster VSAN, but does work on standard VSAN deployments.

Virtual SAN Stretched Clusters Versus Fault Domain

A common question is how a stretched cluster differs from Fault Domains, a Virtual SAN feature that was introduced with Virtual SAN version 6.0. Fault domains enable what might be termed “rack awareness”, where the components of virtual machines can be distributed amongst multiple hosts in multiple racks, and should a rack failure event occur, the virtual machine would continue to be available. However, these racks would typically be hosted in the same data center, and if there was a data center wide event, fault domains would not be able to assist with virtual machine availability.

Stretched clusters essentially build on what fault domains did, and now provide what might be termed “data center awareness”. Virtual SAN Stretched Clusters can now provide availability for virtual machines even if a data center suffers a catastrophic outage.

The Witness Host

The witness host is a dedicated ESXi host (or appliance) whose purpose is to host the witness components of virtual machine objects. The witness must have a connection to both the master Virtual SAN node and the backup Virtual SAN node to join the cluster. In steady state operations, the master node resides in the “preferred site”; the backup node resides in the “secondary site”. Unless the witness host connects to both the master and the backup nodes, it will not join the Virtual SAN cluster.

Read Locality in Virtual SAN Stretched Cluster

In traditional Virtual SAN clusters, a virtual machine’s read operations are distributed across all replica copies of the data in the cluster. In the case of a policy setting of NumberOfFailuresToTolerate=1, which results in two copies of the data, 50% of the reads will come from replica1 and 50% will come from replica2. In the case of a policy setting of NumberOfFailuresToTolerate=2 in non-stretched Virtual SAN clusters, which results in three copies of the data, 33% of the reads will come from replica1, 33% from replica2 and 33% from replica3.

In a Virtual SAN Stretched Cluster, we wish to avoid increased latency caused by reading across the inter-site link. To ensure that 100% of reads occur in the site the VM resides on, the read locality mechanism was introduced. Read locality overrides the NumberOfFailuresToTolerate=1 policy’s behavior to distribute reads across the two data sites.

DOM, the Distributed Object Manager in Virtual SAN, takes care of this. DOM is responsible for the creation of virtual machine storage objects in the Virtual SAN cluster. It is also responsible for providing distributed data access paths to these objects. There are 3 roles within DOM: Client, Owner and Component Manager. There is a single DOM Owner per object. The DOM Owner coordinates access to the object, including reads, locking and object configuration and reconfiguration. All object changes and writes also go through the owner. In a Virtual SAN Stretched Cluster configuration, the DOM Owner of an object now takes into account which fault domain it runs in, and reads from the replica that is in the same fault domain.
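The difference between distributed reads and read locality can be sketched in a few lines. This is a toy model, not the DOM implementation: replicas are plain (name, fault domain) tuples, and reads are spread round-robin, which is an assumption made purely to reproduce the 50/50 split described above.

```python
from collections import Counter
from itertools import cycle, islice


def read_targets(replicas, owner_fault_domain=None):
    """Replicas eligible for reads.

    With no owner fault domain (standard cluster) every replica is eligible.
    With one (stretched cluster) only replicas in the owner's fault domain
    are used, falling back to all replicas if none is local.
    """
    if owner_fault_domain is not None:
        local = [r for r in replicas if r[1] == owner_fault_domain]
        if local:
            return local
    return list(replicas)


def read_distribution(replicas, reads, owner_fault_domain=None):
    """Count how many of `reads` land on each replica (round-robin model)."""
    targets = read_targets(replicas, owner_fault_domain)
    picks = islice(cycle(targets), reads)
    return Counter(name for name, _ in picks)


replicas = [("replica1", "Preferred"), ("replica2", "Secondary")]
print(read_distribution(replicas, 100))               # 50 reads per replica
print(read_distribution(replicas, 100, "Preferred"))  # all 100 reads on replica1
```

The fallback to all replicas when no local copy exists is a choice made for this sketch, so that reads can still be served after the local replica is lost.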

Read locality brings another consideration: unnecessary vMotion of the virtual machine between sites must be avoided. Since the read cache blocks are stored on one site, if the VM moves around freely and ends up on the remote site, the cache will be cold on that site after the move. Performance will then be sub-optimal until the cache is warm again. To avoid this situation, soft affinity rules are used to keep the VM local to the same site/fault domain where possible. The steps to configure such rules will be shown in detail in the vSphere DRS section of this guide.

Virtual SAN 6.2 introduced Client Cache, a mechanism that allocates 0.4% of host memory, up to 1GB, as an additional read cache tier. Virtual machines leverage the Client Cache of the host they are running on. Client Cache is not associated with Stretched Cluster read locality, and runs independently.
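As a quick worked example of the Client Cache sizing rule, the sketch below computes 0.4% of host memory and caps it at 1GB. Treating "1GB" as 1 GiB (1024^3 bytes) is an assumption of this example, and `client_cache_bytes` is an illustrative name.

```python
GIB = 1024 ** 3  # assumption: the 1GB cap is interpreted as 1 GiB


def client_cache_bytes(host_memory_bytes):
    """0.4% of host memory, capped at 1 GiB (integer arithmetic)."""
    return min(host_memory_bytes * 4 // 1000, GIB)


for mem_gib in (64, 128, 256, 512):
    size = client_cache_bytes(mem_gib * GIB)
    print(f"{mem_gib} GiB host -> {size / GIB:.3f} GiB client cache")
```

Under these assumptions the cap is reached once a host has 250 GiB of memory or more; smaller hosts get proportionally less.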

VMware vCenter Server

A Virtual SAN Stretched Cluster configuration can be created and managed by a single instance of VMware vCenter Server. Both the Windows version and the Virtual Appliance version (Linux) are supported for configuration and management of a Virtual SAN Stretched Cluster.

A Witness Host

In a Virtual SAN Stretched Cluster, the witness components are only ever placed on the witness host. Either a physical ESXi host or a special witness appliance provided by VMware can be used as the witness host.

If a witness appliance is used for the witness host, it will not consume any of the customer’s vSphere licenses. A physical ESXi host that is used as a witness host will need to be licensed accordingly, as this can still be used to provision virtual machines should a customer choose to do so.

It is important that the witness host is not added to the VSAN cluster. The witness host is selected during the creation of a Virtual SAN Stretched Cluster.

The witness appliance will have a unique identifier in the vSphere web client UI to assist with identifying that a host is in fact a witness appliance (ESXi in a VM). It is shown as a “blue” host, as highlighted below:

Note that this is only visible when the witness is deployed as an ESXi appliance. If a physical host is used as the witness, then it does not change its appearance in the web client. A witness host is dedicated to each stretched cluster.

Networking and Latency Requirements

When Virtual SAN is deployed in a stretched cluster across multiple sites using fault domains, there are certain networking requirements that must be adhered to.

Layer 2 and Layer 3 Support

https://i2.wp.com/blogs.vmware.com/virtualblocks/files/2016/01/Capture1.jpg?w=904

Both Layer 2 (same subnet) and Layer 3 (routed) configurations are used in a recommended Virtual SAN Stretched Cluster deployment.

  • VMware recommends that Virtual SAN communication between the data sites be over stretched L2.
  • VMware recommends that Virtual SAN communication between the data sites and the witness site be routed over L3.

Note: A common question is whether L2 for Virtual SAN traffic across all sites is supported. There are some considerations with the use of a stretched L2 domain between the data sites and the witness site, and these are discussed in further detail in the design considerations section of this guide. Another common question is whether L3 for VSAN traffic across all sites is supported. While this can work, it is not the VMware recommended network topology for Virtual SAN Stretched Clusters at this time.

Virtual SAN traffic between data sites is multicast. Witness traffic between a data site and the witness site is unicast.

Supported Geographical Distances

For VMware Virtual SAN Stretched Clusters, geographical distances are not a support concern. The key requirement is the actual latency numbers between sites.

Data Site to Data Site Network Latency

Data site to data site network refers to the communication between non-witness sites, in other words, sites that run virtual machines and hold virtual machine data. Latency or RTT (Round Trip Time) between sites hosting virtual machine objects should not be greater than 5msec (< 2.5msec one-way).

Data Site to Data Site Bandwidth

Bandwidth between sites hosting virtual machine objects will be workload dependent. For most workloads, VMware recommends a minimum of 10Gbps bandwidth between sites. In use cases such as 2 Node configurations for Remote Office/Branch Office deployments, dedicated 1Gbps bandwidth can be sufficient with fewer than 10 virtual machines.

Please refer to the Design Considerations section of this guide for further details on how to determine bandwidth requirements.

Data Site to Witness Network Latency

This refers to the communication between non-witness  sites and the witness site.

In most Virtual SAN Stretched Cluster configurations, latency or RTT (Round Trip Time) between sites hosting VM objects and the witness nodes should not be greater than 200msec (100msec one-way).

In typical 2 Node configurations, such as Remote Office/Branch Office deployments, this latency or RTT is supported up to 500msec (250msec one-way).

The latency to the witness is dependent on the number of objects in the cluster. For Virtual SAN Stretched Cluster configurations up to 10+10+1, VMware considers a latency of less than or equal to 200 milliseconds acceptable, although a latency of less than or equal to 100 milliseconds is preferred where possible. For configurations that are greater than 10+10+1, VMware requires a latency of less than or equal to 100 milliseconds.
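The latency limits in this guide can be collected into a simple check. The sketch below is illustrative only: the function names and the `two_node` flag are assumptions of this example, and the thresholds are the RTT figures quoted in this section and in the data site to data site latency section.

```python
DATA_SITE_MAX_RTT_MS = 5  # data site to data site limit quoted earlier


def max_witness_rtt_ms(hosts_a, hosts_b, two_node=False):
    """Maximum data-site-to-witness RTT in milliseconds for an X+Y+1 cluster."""
    if two_node and hosts_a == 1 and hosts_b == 1:
        return 500  # typical 2 Node (ROBO) configurations
    if hosts_a <= 10 and hosts_b <= 10:
        return 200  # up to 10+10+1
    return 100      # larger than 10+10+1


def latency_ok(hosts_a, hosts_b, data_rtt_ms, witness_rtt_ms, two_node=False):
    """True if both measured RTT values are within the quoted limits."""
    witness_limit = max_witness_rtt_ms(hosts_a, hosts_b, two_node)
    return data_rtt_ms <= DATA_SITE_MAX_RTT_MS and witness_rtt_ms <= witness_limit


print(latency_ok(8, 8, data_rtt_ms=3, witness_rtt_ms=150))    # True
print(latency_ok(12, 12, data_rtt_ms=3, witness_rtt_ms=150))  # False
```

The same 150 ms witness latency passes for an 8+8+1 cluster but fails for 12+12+1, because the larger configuration falls under the 100 ms limit.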

Data Site to Witness Network Bandwidth

Bandwidth between sites hosting VM objects and the witness nodes is dependent on the number of objects residing on Virtual SAN. It is important to size data site to witness bandwidth appropriately for both availability and growth. A standard rule of thumb is 2Mbps for every 1000 objects on Virtual SAN.
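The rule of thumb translates directly into a one-line calculation. In the sketch below, `growth_factor` is a hypothetical parameter added for this example to leave headroom for growth; it is not a VMware-defined value.

```python
def witness_bandwidth_mbps(object_count, growth_factor=1.0):
    """Rule of thumb: 2 Mbps per 1000 objects, scaled by optional headroom."""
    return object_count * 2 / 1000 * growth_factor


print(witness_bandwidth_mbps(25_000))                     # 50.0 Mbps
print(witness_bandwidth_mbps(25_000, growth_factor=1.5))  # 75.0 Mbps
```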

Please refer to the Design Considerations section of this guide for further details on how to determine bandwidth requirements.

Inter-Site MTU Consistency

It is important to maintain a consistent MTU size between data nodes and the witness in a Stretched Cluster configuration. Ensuring that each VMkernel interface designated for Virtual SAN traffic is set to the same MTU size will prevent traffic fragmentation. The Virtual SAN Health Check checks for a uniform MTU size across the Virtual SAN data network, and reports on any inconsistencies.
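The kind of comparison such a check performs can be sketched as follows. This is not the Virtual SAN Health Check itself: the input is a hand-built mapping of host name to the MTU of its Virtual SAN VMkernel interface, and the host names are hypothetical.

```python
from collections import defaultdict


def mtu_mismatches(vmk_mtus):
    """Group hosts by MTU; return {} when every host uses the same value."""
    by_mtu = defaultdict(list)
    for host, mtu in vmk_mtus.items():
        by_mtu[mtu].append(host)
    if len(by_mtu) <= 1:
        return {}
    return {mtu: sorted(hosts) for mtu, hosts in by_mtu.items()}


# Hypothetical inventory: two data nodes at 9000, the witness at 1500.
inventory = {"esxi-a1": 9000, "esxi-b1": 9000, "witness": 1500}
print(mtu_mismatches(inventory))
```

An empty result means the MTU is uniform; otherwise the grouping shows which hosts need to be brought in line.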

Virtual Machines Per Host

The maximum number of virtual machines per ESXi host is unaffected by the Virtual SAN Stretched Cluster configuration. The maximum is the same as for normal VSAN deployments.

VMware recommends that customers run their hosts at 50% of the maximum number of virtual machines supported in a standard Virtual SAN cluster to accommodate a full site failure. In the event of a full site failure, the virtual machines on the failed site can be restarted on the hosts in the surviving site.
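The reasoning behind the 50% recommendation can be checked with simple arithmetic: after a full site failure, one site must run the virtual machines of both. In the sketch below, `max_vms_per_host` is a parameter supplied by the reader, since this section does not state the per-host maximum, and the function name is illustrative.

```python
def survives_site_failure(vms_site_a, vms_site_b, hosts_per_site, max_vms_per_host):
    """True if a single surviving site can restart every VM in the cluster."""
    surviving_capacity = hosts_per_site * max_vms_per_host
    return vms_site_a + vms_site_b <= surviving_capacity


# Example with a hypothetical limit of 100 VMs per host and 4 hosts per site.
print(survives_site_failure(200, 200, hosts_per_site=4, max_vms_per_host=100))  # True
print(survives_site_failure(300, 300, hosts_per_site=4, max_vms_per_host=100))  # False
```

In the first case each host runs at 50% of the example limit, so the surviving four hosts can absorb all 400 virtual machines; in the second case they cannot.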

Hosts Per Cluster

The minimum number of hosts in a Virtual SAN Stretched Cluster is 3. In such a configuration, site 1 will contain a single ESXi host, site 2 will contain a single ESXi host and then there is a witness host at the third site, the witness site. The nomenclature for such a configuration is 1+1+1. This is commonly referred to as a 2 Node configuration.

The maximum number of hosts in a Virtual SAN Stretched Cluster is 31. Site 1 contains 15 ESXi hosts, site 2 contains 15 ESXi hosts, and the witness host on the third site makes 31. This is referred to as a 15+15+1 configuration.
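The X+Y+Z nomenclature and the limits above lend themselves to a small validator. This is a sketch with illustrative names; it treats unequal data site host counts as a problem because this guide describes sites with an identical number of hosts.

```python
def parse_config(text):
    """Parse an 'X+Y+Z' string into a tuple of three integers."""
    parts = text.split("+")
    if len(parts) != 3:
        raise ValueError(f"expected X+Y+Z, got {text!r}")
    x, y, z = (int(p) for p in parts)
    return x, y, z


def validate(text):
    """Return a list of problems; an empty list means the layout is in range."""
    x, y, z = parse_config(text)
    problems = []
    if z != 1:
        problems.append("exactly one witness host is required")
    if not (1 <= x <= 15 and 1 <= y <= 15):
        problems.append("each data site needs between 1 and 15 hosts")
    if x != y:
        problems.append("data sites are described with identical host counts")
    return problems


for cfg in ("1+1+1", "15+15+1", "16+16+1", "3+3+2"):
    print(cfg, validate(cfg) or "ok")
```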

Witness Host

There is a maximum of 1 witness host per Virtual SAN Stretched Cluster. The witness host requirements are discussed in the design considerations section of this guide. VMware provides a fully supported witness virtual appliance, in Open Virtual Appliance (OVA) format, for customers who do not wish to dedicate a physical ESXi host as the witness. This OVA is essentially a pre-licensed ESXi host running in a virtual machine, and can be deployed on a physical ESXi host on the third site.

Number Of Failures to Tolerate

Because Virtual SAN Stretched Cluster configurations effectively have 3 fault domains, the Number Of Failures To Tolerate (FTT) policy setting has a maximum of 1 for objects. Virtual SAN cannot comply with FTT values that are greater than 1 in a stretched cluster configuration.

Other policy settings are not impacted by deploying VSAN in a stretched cluster configuration and can be used as per a non-stretched VSAN cluster.

Fault Domains

Fault domains play an important role in Virtual SAN Stretched Cluster. Similar to the Number Of Failures To Tolerate (FTT) policy setting discussed previously, the maximum number of fault domains in a Virtual SAN Stretched Cluster is 3. The first FD is the “preferred” data site, the second FD is the “secondary” data site and the third FD is the witness host site.
