Storage in a hyper-converged infrastructure (HCI) requires compute resources that have been traditionally offloaded to dedicated storage arrays. Most HCI solutions require the deployment of storage virtual appliances to some or all of the hosts in the cluster to provide storage services to each host. These virtual appliances typically require CPU and/or memory reservations to avoid resource contention, which can result in performance degradation. Running a virtual appliance on every host in the cluster reduces the overall amount of compute resources available to run regular virtual machine workloads. Consolidation ratios will likely be lower and total cost of ownership rises when these storage virtual appliances are present and competing for the same resources as regular virtual machine workloads.
Storage virtual appliances can also introduce additional latency, which negatively affects performance. This is due to the number of steps required to handle and replicate write operations as shown in the figure below.
Virtual SAN does not require the deployment of storage virtual appliances or the installation of a vSphere Installation Bundle (VIB) on every host in the cluster. Virtual SAN is embedded in the vSphere kernel and typically consumes less than 10% of the compute resources on each host. Virtual SAN does not compete with other virtual machines for resources and the IO path is shorter.
A shorter IO path and the absence of resource-intensive storage virtual appliances enable Virtual SAN to provide extreme performance with minimal overhead. Higher consolidation ratios translate into a lower total cost of ownership.
Virtual SAN is built into vSphere. Virtual SAN is enabled with just a few mouse clicks. There is no requirement to install additional software and/or deploy virtual storage appliances to every host in the cluster. Simply click the Enable Virtual SAN checkbox to start the process. Deduplication and compression can also be enabled at that time.
The next step is claiming local storage devices in each host for the Virtual SAN cache and capacity tiers. One or more disk groups are created in each host. Each disk group contains one cache device (flash) and one or more capacity devices (flash or magnetic). Virtual SAN pools these local storage devices together to create a pool of shared storage.
The process of enabling Virtual SAN takes only a matter of minutes. This is a tribute to the simplicity of Virtual SAN, especially when you compare it to other enterprise-class hyper-converged and traditional storage systems, which typically take much longer to set up.
Virtual SAN 6.2 features a comprehensive health service that actively tests and monitors a number of items such as hardware compatibility, network connectivity, cluster health, and capacity consumption. The health service is enabled by default and configured to check the health of the Virtual SAN environment every 60 minutes.
The health service is quite thorough in the number of tests it performs. As an example, proper network configuration is essential to a healthy Virtual SAN cluster and there are 11 tests in the “Network” section of the Virtual SAN Health user interface.
If an issue is detected, a warning is visible in the Virtual SAN user interface. Clicking on the warning provides more details about the issue. For example, a controller driver that is not on the hardware compatibility list (HCL) will trigger a warning. In addition to providing details about the warning, Virtual SAN Health also has an “Ask VMware” button, which brings up the relevant VMware Knowledge Base article.
vSphere and Virtual SAN support a wide variety of hardware configurations. The list of hardware components and corresponding drivers that are supported with Virtual SAN can be found in the VMware Compatibility Guide. It is very important to use only hardware, firmware, and drivers found in this guide to ensure stability and performance. The list of certified hardware, firmware, and drivers is contained in a hardware compatibility list (HCL) database. Virtual SAN makes it easy to update this information for use by the health service tests. If the environment has Internet connectivity, updates can be obtained directly from VMware. Otherwise, HCL updates can be downloaded to enable offline updates.
If an issue does arise that requires the assistance of VMware Support, it is easy to upload support bundles to help expedite the troubleshooting process. Clicking the “Upload Support Bundles to Service Request…” button enables an administrator to enter an existing support request (SR) number and upload the necessary logs with just a few mouse clicks.
Figure x. Virtual SAN HCL Database and Support Assistant
Virtual SAN proactive tests enable administrators to verify Virtual SAN configuration, stability, and performance to minimize risk and confirm that the datastore is ready for production use. Three proactive tests are available:
The VM creation test creates and deletes a small virtual machine on each host confirming basic functionality. The multicast test verifies multicast is working properly on each host and performance meets Virtual SAN requirements. The following figure shows the results of a VM creation test.
The storage performance test is used to check the stability of the Virtual SAN cluster under heavy I/O load. There are a number of workload profiles that can be selected for this test as shown below. Keep in mind the storage performance test can affect other workloads and tasks. This test is intended to run before production virtual machine workloads are provisioned on Virtual SAN.
Capacity overviews are available in the Virtual SAN user interface making it easy for administrators to see used and free space at a glance. Deduplication and compression information is also displayed.
Information is also available showing how much capacity various object types are consuming. Note that percentages are of used capacity, not of total capacity.
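The distinction between "percentage of used capacity" and "percentage of total capacity" can be illustrated with a short calculation. This is a minimal sketch: the function name and the capacity figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def used_capacity_breakdown(used_by_type_gb):
    """Return each object type's share of *used* capacity, in percent."""
    total_used = sum(used_by_type_gb.values())
    if total_used == 0:
        return {name: 0.0 for name in used_by_type_gb}
    return {name: 100.0 * gb / total_used
            for name, gb in used_by_type_gb.items()}


# Hypothetical figures for illustration only.
usage_gb = {"Virtual disks": 600.0, "VM home objects": 100.0, "Other": 100.0}
breakdown = used_capacity_breakdown(usage_gb)

for name, pct in breakdown.items():
    print(f"{name}: {pct:.1f}% of used capacity")
```

In this example the virtual disks account for 75% of used capacity, regardless of how large the datastore is in total.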
This list provides more details on the object types in the Used Capacity Breakdown chart:
A healthy Virtual SAN environment is one that is performing well. Virtual SAN includes a number of graphs and data points that provide performance information at the cluster, host, virtual machine, and virtual disk levels. Time Range can be modified to show information from the last 1-24 hours or a custom date and time range.
The performance service is enabled at the cluster level. The performance history database is stored as a Virtual SAN object independent of vCenter. A storage policy is assigned to the object to control space consumption, availability, and performance of that object. If the object becomes unavailable, performance history for the cluster cannot be viewed until access to the object is restored.
The performance service is turned off by default. A few mouse clicks are all that is needed to enable the service.
At the cluster level, the performance monitoring service shows performance metrics for virtual machines running on Virtual SAN. These metrics provide quick visibility to the entire Virtual SAN cluster. A number of graphs are included such as read and write IOPS, read and write throughput, read and write latency, and congestion.
Backend consumption stems from activities such as metadata updates and component builds. For example, consider a virtual machine with the number of failures to tolerate set to 1 and a failure tolerance method of RAID-1 (Mirroring). For every write IO to a virtual disk, two writes are produced on the backend: one to each component replica, which reside on two different hosts.
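The relationship between virtual machine writes and backend writes under RAID-1 mirroring can be sketched as follows. This is a simplified model that counts only replica writes. It ignores metadata updates and component builds, and the function name is hypothetical.

```python
def backend_writes_raid1(frontend_writes, failures_to_tolerate=1):
    """Replica writes generated on the backend under RAID-1 mirroring.

    Mirroring keeps (failures_to_tolerate + 1) replicas, and every
    frontend write is applied to each replica.
    """
    replicas = failures_to_tolerate + 1
    return frontend_writes * replicas


# 1,000 guest writes with FTT=1 produce 2,000 replica writes.
print(backend_writes_raid1(1000))
```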
In addition to virtual machine consumption and backend performance metrics, disk group and individual disk performance information is available at the host level. Seeing metrics for individual disks eases the process of troubleshooting issues such as failed storage devices.
Virtual Machine Metrics
Virtual SAN performance information is also available for individual virtual machines and virtual disks. Metrics include IOPS, throughput, and latency. The figure below shows virtual disk-level Virtual SCSI throughput and latencies for reads and writes.
Figure x. Virtual Disk Performance
Traditional Storage Management
Traditional storage models utilize LUNs or volumes. A LUN or a volume is commonly configured with a specific disk configuration such as RAID to provide a specific level of performance and availability. The challenge with this model is that each LUN or volume is confined to providing only one level of service regardless of the workloads that it contains. This leads to the provisioning of numerous LUNs or volumes in an attempt to provide the right levels of storage services to each workload. Maintaining a large number of LUNs or volumes leads to management complexity. Deployment and management of workloads and storage in traditional storage environments can be time consuming and error prone.
Storage Policy Based Management
Storage Policy Based Management (SPBM) enables precise control of the storage services. Similar to other storage solutions, Virtual SAN provides services such as availability level, striping for performance, and the ability to limit IOPS. Policies that contain one or more rules are created using the vSphere Web Client.
These policies are assigned to virtual machines and individual objects such as a virtual disk. Storage policies can easily be changed and/or reassigned if application requirements change. These changes are performed with no downtime and without the need to migrate (Storage vMotion) virtual machines from one LUN or volume to another. This approach makes it possible to assign and modify service levels based on specific application needs even though the virtual machines reside on the same datastore.
Virtual SAN features an extensive management API and multiple software development kits (SDKs) to provide IT organizations options for rapid provisioning and automation. Administrators and developers can orchestrate all aspects of installation, configuration, lifecycle management, monitoring, and troubleshooting of Virtual SAN environments. This is especially useful in large environments and geographically dispersed organizations to speed up deployment times, reduce operational costs, maintain standards, and orchestrate common workflows.
SDKs are available for several programming languages including .NET, Perl, and Python. They are available for download from VMware Developer Center and include libraries, documentation, and code samples. For example, this Python script can be used to generate Virtual SAN capacity information: Virtual SAN Capacity – Total and Free from vCenter
Virtual SAN APIs can also be accessed through vSphere PowerCLI cmdlets. IT administrators can automate common tasks such as assigning storage policies and checking storage policy compliance. Consider a repeatable task such as deploying or upgrading two-node Virtual SAN clusters at 100 retail store locations. Performing each one manually would take a considerable amount of time. There is also a higher risk of error leading to non-standard configurations and possibly downtime. vSphere PowerCLI can instead be used to ensure all of the Virtual SAN clusters are deployed with the same configuration. Lifecycle management, such as applying patches and upgrades, is also much easier when these tasks are automated.
This video demonstrates a number of operations from creating a new cluster to configuring Virtual SAN using just a few lines of vSphere PowerCLI code:
vSphere PowerCLI: Creating a Cluster and Configuring Virtual SAN
Virtual SAN (VSAN) is an object datastore with a mostly flat hierarchy of objects and containers (folders). Items that make up a virtual machine (VM) are represented by objects. These are the most common object types you will find on a VSAN datastore:
VM Home, which contains virtual machine configuration files and logs such as the VMX and NVRAM files
Virtual Disk (VMDK)
Delta Disk (snapshot)
Memory Delta, which is present when the checkbox to snapshot a VM’s memory is checked
There are a few other objects that might be found on a VSAN datastore such as the VSAN performance service database and VMDKs that belong to iSCSI targets.
Each object consists of one or more components. The number of components that make up an object depends primarily on a couple of things: the size of the object and the storage policy assigned to it. The maximum size of a component is 255GB. If an object is larger than 255GB, it is split up into multiple components. The image below shows a 600GB virtual disk split up into three components.
In most cases, a VM will have a storage policy assigned that contains availability rules such as Number of Failures to Tolerate and Failure Tolerance Method. These rules will also affect the number of components that make up an object. As an example, let’s take that same 600GB virtual disk and apply the Virtual SAN Default Storage Policy, which uses the RAID-1 mirroring failure tolerance method and has the number of failures to tolerate set to one. The 600GB object with three components will be mirrored on another host. This configuration provides two full copies of the data distributed across two hosts so that the loss of a disk or an entire host can be tolerated. Below is an image showing the six components (three on each host). A seventh component, the witness, is created by VSAN to “break the tie” and achieve quorum in the event of a network partition between the hosts. The witness component is placed on a third host.
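The component arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch, not a description of the actual placement logic: the function names are hypothetical, and the single witness is an assumption that matches the FTT=1 example above. Real placements can produce different component and witness counts.

```python
import math

MAX_COMPONENT_GB = 255  # maximum component size stated in the text


def data_components(object_size_gb):
    """Number of components needed to hold one copy of an object."""
    return max(1, math.ceil(object_size_gb / MAX_COMPONENT_GB))


def raid1_component_count(object_size_gb, failures_to_tolerate=1):
    """Return (data_components, witness_components) for RAID-1 mirroring."""
    replicas = failures_to_tolerate + 1
    data = data_components(object_size_gb) * replicas
    witnesses = 1  # assumption: one witness, as in the FTT=1 example
    return data, witnesses


data, witnesses = raid1_component_count(600)
print(f"{data} data components + {witnesses} witness = {data + witnesses}")
```

For the 600GB virtual disk this yields six data components plus one witness, for seven components in total.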
In this last example of component placement, we take the same 600GB virtual disk and apply a storage policy with RAID-5 erasure coding (Failures to Tolerate = 1). The object now consists of four components (three data components and a parity component) distributed across the four hosts in the cluster. If a disk or host containing any one of these components is offline, the data is still accessible. If one of these components is permanently lost, Virtual SAN can rebuild the lost data or parity component from the other three surviving components.
Virtual SAN requires a minimum number of hosts depending on the failure tolerance method and number of failures to tolerate (FTT) configurations. For example, a minimum of three hosts are needed for FTT=1 with RAID-1 mirroring. A minimum of four hosts are required for FTT=1 with RAID-5 erasure coding. More implementation details and recommendations can be found in the Virtual SAN Design and Sizing Guide.
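A simple lookup makes the host-count requirement easy to check before assigning a policy. This is a sketch under stated assumptions: the function name is hypothetical, the RAID-1 and RAID-5 figures come from the paragraph above, and the RAID-6 figure comes from the erasure coding discussion elsewhere in this document. Pairing RAID-6 with FTT=2 is an assumption, since the text does not state the FTT value for RAID-6.

```python
# (failure tolerance method, failures to tolerate) -> minimum hosts
MIN_HOSTS = {
    ("RAID-1", 1): 3,
    ("RAID-5", 1): 4,
    ("RAID-6", 2): 6,  # assumption: RAID-6 corresponds to FTT=2
}


def policy_fits(method, failures_to_tolerate, host_count):
    """Return True if the cluster has enough hosts for the policy."""
    key = (method, failures_to_tolerate)
    if key not in MIN_HOSTS:
        raise ValueError(f"no minimum host count recorded for {key}")
    return host_count >= MIN_HOSTS[key]


print(policy_fits("RAID-1", 1, 3))  # a three-host cluster can mirror
print(policy_fits("RAID-5", 1, 3))  # but it cannot use RAID-5
```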
“Fault domain” is a term that comes up fairly often in availability discussions. In IT, a fault domain usually refers to a group of servers, storage, and/or networking components that would be impacted collectively by an outage. A very common example of this is a server rack. If a top-of-rack (TOR) switch or the power distribution unit (PDU) for a server rack were to fail, it would take all of the servers in that rack offline even though the servers themselves are functioning properly. That server rack is considered a fault domain.
While the failure of a disk or entire host can be tolerated, what if all of these servers are in the same rack and the TOR switch goes offline? Answer: All hosts are isolated from each other and none of the objects are accessible. To mitigate this risk, the servers in a Virtual SAN cluster should be spread across server racks and fault domains must be configured in the Virtual SAN user interface. After fault domains are configured, Virtual SAN will redistribute the components across server racks to eliminate the risk of a rack failure taking multiple objects offline. This feature is commonly referred to as “Rack Awareness”. The diagram below shows what this might look like with a 12-node Virtual SAN cluster spread across four server racks.
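The idea behind rack awareness can be sketched in a few lines: each component of an object is placed on a host in a different fault domain. This is an illustration only, not the Virtual SAN placement algorithm. The host names, rack names, and function name are hypothetical.

```python
from collections import defaultdict


def place_across_fault_domains(host_to_rack, component_names):
    """Assign each component to a host in a different rack."""
    hosts_by_rack = defaultdict(list)
    for host, rack in sorted(host_to_rack.items()):
        hosts_by_rack[rack].append(host)

    racks = sorted(hosts_by_rack)
    if len(component_names) > len(racks):
        raise ValueError("not enough fault domains for the components")

    # One component per rack; pick the first host in each rack.
    return {name: hosts_by_rack[rack][0]
            for name, rack in zip(component_names, racks)}


# Hypothetical 12-node cluster spread evenly across four racks.
cluster = {f"esxi-{i:02d}": f"rack-{(i - 1) // 3 + 1}" for i in range(1, 13)}
placement = place_across_fault_domains(
    cluster, ["replica-1", "replica-2", "witness"])

for component, host in placement.items():
    print(component, "->", host, "in", cluster[host])
```

Because no two components of the object share a rack, the loss of a single rack leaves a majority of the components online.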
Configuring Virtual SAN fault domains is quite simple as demonstrated in this video (no audio): Virtual SAN Fault Domains
Virtual SAN Stretched Clusters provide organizations with the capability to deploy a Virtual SAN cluster across two locations. These locations can be opposite sides of the same data center, two buildings on the same campus, or geographically dispersed between two cities. It is important to note that this technology does have bandwidth and latency requirements as detailed in the Virtual SAN Stretched Cluster Guide and 2-Node Guide.
A stretched cluster provides resiliency against larger scale outages and disasters by keeping two copies of the data – one at each location. If a failure occurs at either location, all data is available at the other location. vSphere HA restarts any virtual machines affected by the outage using the copy of the data at the surviving location. In the case of disaster avoidance such as an impending storm or rising flood waters, virtual machines can be migrated from one location to the other with no downtime using vMotion.
Since Virtual SAN is a clustering technology, a witness is required to achieve quorum in the case of a “split-brain” scenario where the two locations lose network connectivity. A Virtual SAN witness is simply a virtual machine running ESXi. The witness is deployed at a third location (separate from the two primary data locations) to avoid being affected by any issues that could occur at either of the main sites.
The witness does not store data such as virtual disk (VMDK) objects. Only witness objects are stored in the witness virtual appliance. If any one of the three sites goes offline, there is still more than 50% of each object’s components online to achieve quorum and maintain availability.
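The quorum rule described here can be checked with a small model. This is a simplified sketch: it assumes an object has exactly one component at each of the three sites (a data copy at each data site and a witness component at the witness site), and the site and function names are hypothetical.

```python
def has_quorum(components_online, components_total):
    """True when strictly more than 50% of the components are online."""
    return 2 * components_online > components_total


def survives_site_loss(components_per_site, failed_site):
    """Check whether an object keeps quorum after one site fails."""
    total = sum(components_per_site.values())
    online = total - components_per_site[failed_site]
    return has_quorum(online, total)


# Assumed layout: one component of the object at each site.
sites = {"site-a": 1, "site-b": 1, "witness-site": 1}

for site in sites:
    print(site, "offline -> quorum kept:", survives_site_loss(sites, site))
```

With this layout, losing any single site leaves two of three components online, which is more than 50%. Without the witness, a split between the two data sites would leave each side with exactly half of the components and neither would have quorum.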
Deduplication and Compression
Enabling deduplication and compression can reduce the amount of physical storage consumed by as much as 7x, resulting in a lower total cost of ownership (TCO). Environments with highly redundant data such as full-clone virtual desktops and homogeneous server operating systems will naturally benefit the most from deduplication. Likewise, compression will offer more favorable results with data that compresses well such as text, bitmap, and program files. Data that is already compressed such as certain graphics formats and video files, as well as files that are encrypted, will yield little or no reduction in storage consumption from compression. In other words, deduplication and compression results will vary based on the types of data stored in an all-flash Virtual SAN environment.
Deduplication and compression is a single cluster-wide setting that is disabled by default and can be enabled using a simple drop-down menu. Note that a rolling format of all disks in the Virtual SAN cluster is required, which can take a considerable amount of time. However, this process does not incur virtual machine downtime and can be done online, usually during an upgrade. Deduplication and compression are enabled as a unit. It is not possible to enable deduplication or compression individually.
Deduplication and compression are implemented after write acknowledgement to minimize impact to performance. Deduplication occurs when data is de-staged from the cache tier to the capacity tier of an all-flash Virtual SAN datastore. The deduplication algorithm utilizes a 4K-fixed block size and is performed within each disk group. In other words, redundant copies of a block within the same disk group are reduced to one copy, but redundant blocks across multiple disk groups are not deduplicated.
The compression algorithm is applied after deduplication has occurred, just before the data is written to the capacity tier. Considering the additional compute resource and allocation map overhead of compression, Virtual SAN will only store compressed data if a unique 4K block can be reduced to 2K or less. Otherwise, the block is written uncompressed to avoid the use of additional resources when compressing and decompressing these blocks, which would provide little benefit.
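The de-staging logic described above (deduplicate 4K blocks within a disk group, then keep the compressed form only if it is 2K or less) can be sketched as follows. This is an illustration under stated assumptions, not the Virtual SAN implementation: SHA-1 and zlib are stand-ins because the text does not name the hash or compression algorithms, and the function and variable names are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096          # 4K fixed block size from the text
COMPRESS_THRESHOLD = 2048  # keep compressed data only if 2K or less


def destage(data, disk_group_store):
    """Deduplicate, then compress, 4K blocks into one disk group's store.

    disk_group_store maps a block fingerprint to (is_compressed, payload).
    Returns the list of fingerprints that describe `data`.
    """
    fingerprints = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha1(block).hexdigest()  # stand-in fingerprint
        fingerprints.append(digest)
        if digest in disk_group_store:
            continue  # duplicate in this disk group: keep the one copy
        packed = zlib.compress(block)  # stand-in compressor
        if len(packed) <= COMPRESS_THRESHOLD:
            disk_group_store[digest] = (True, packed)
        else:
            disk_group_store[digest] = (False, block)  # store uncompressed
    return fingerprints


# A highly compressible block and a block of hash output that is not.
zeros = bytes(BLOCK_SIZE)
incompressible = b"".join(
    hashlib.sha256(bytes([i])).digest() for i in range(128))

group_a, group_b = {}, {}  # each disk group has its own store
refs = destage(zeros * 3 + incompressible, group_a)
destage(zeros, group_b)

print(len(refs), "blocks written,", len(group_a), "unique blocks stored")
```

Three identical blocks in the same disk group collapse to one stored copy, while the same block written to a second disk group is stored again, since deduplication does not span disk groups.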
The processes of deduplication and compression on any storage platform incur overhead and potentially impact performance in terms of latency and maximum IOPS. Virtual SAN is no exception. However, considering deduplication and compression are only supported in all-flash Virtual SAN configurations, these effects are predictable in the majority of use cases. The extreme performance and low latency of flash devices easily outweigh the additional resource requirements of deduplication and compression. The space efficiency generated by deduplication and compression lowers the cost-per-usable-GB of all-flash configurations.
RAID-5/6 Erasure Coding
RAID-5/6 erasure coding is a space efficiency feature optimized for all-flash configurations. Erasure coding provides the same levels of redundancy as mirroring, but with a reduced capacity requirement. In general, erasure coding is a method of taking data, breaking it into multiple pieces and spreading it across multiple devices, while adding parity data so it may be recreated in the event one of the pieces is corrupted or lost.
Unlike deduplication and compression, which offer variable levels of space efficiency, erasure coding guarantees capacity reduction over a mirroring data protection method at the same failure tolerance level. As an example, let’s consider a 100GB virtual disk. Surviving one disk or host failure requires 2 copies of data at 2x the capacity, i.e., 200GB. If RAID-5 erasure coding is used to protect the object, the 100GB virtual disk will consume 133GB of raw capacity – a 33% reduction in consumed capacity versus RAID-1 mirroring.
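The capacity figures in this example follow directly from the layout of each protection method. The sketch below reproduces them. The RAID-5 multiplier of 4/3 is derived from the three data components plus one parity component described earlier, and the function names are hypothetical.

```python
from fractions import Fraction

# Raw capacity consumed per unit of usable capacity.
CAPACITY_MULTIPLIER = {
    "RAID-1 (FTT=1)": Fraction(2, 1),  # two full copies
    "RAID-5 (FTT=1)": Fraction(4, 3),  # 3 data + 1 parity
}


def raw_capacity_gb(usable_gb, policy):
    """Raw capacity consumed by an object of the given usable size."""
    return float(usable_gb * CAPACITY_MULTIPLIER[policy])


def savings_vs_mirroring(policy):
    """Fractional reduction in consumed capacity versus RAID-1, FTT=1."""
    mirror = CAPACITY_MULTIPLIER["RAID-1 (FTT=1)"]
    return float(1 - CAPACITY_MULTIPLIER[policy] / mirror)


for policy in CAPACITY_MULTIPLIER:
    print(f"{policy}: {raw_capacity_gb(100, policy):.0f} GB raw, "
          f"{savings_vs_mirroring(policy):.0%} less than mirroring")
```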
While erasure coding provides significant capacity savings over mirroring, understand that erasure coding requires additional processing overhead. This is common among any storage platform today. Erasure coding is only supported in all-flash Virtual SAN configurations. Therefore, performance impact is negligible in most use cases due to the inherent performance of flash devices. Also note that RAID-5 erasure coding (FTT=1) requires a minimum of four hosts and RAID-6 erasure coding requires a minimum of six hosts.
Virtual SAN has the ability to limit the number of IOPS a virtual machine or virtual disk generates. There are situations where it is advantageous to limit the IOPS of one or more virtual machines. The term “noisy neighbor” is often used to describe a workload that monopolizes available I/O or other resources, which negatively impacts other workloads or tenants in the same environment.
An example of a possible noisy neighbor scenario is month-end reporting. Management requests delivery of these reports on the second day of each month so the reports are generated on the first day of each month. The virtual machines that run the reporting application and database are dormant most of the time. Running the reports takes just a few hours, but this generates very high levels of storage I/O. The performance of other workloads in the environment is impacted while the reports are running. To remedy this issue, an administrator creates a storage policy with an IOPS limit rule and assigns the policy to the virtual machines running the reporting application and database. The IOPS limit eliminates the performance impact to the other virtual machines. The reports take longer, but they are still finished in plenty of time for delivery the next day.
Keep in mind storage policies can be dynamically created, modified, and assigned to virtual machines. If an IOPS limit is proving to be too restrictive, simply modify the existing policy or create a new policy with a different IOPS limit and assign it to the virtual machines. The new or updated policy will take effect just moments after the change is made.
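When choosing an IOPS limit for a batch job such as the reporting example, a rough estimate of the resulting run time helps confirm the job will still finish on schedule. The sketch below assumes the job is purely I/O bound. All of the numbers and the function name are hypothetical.

```python
def runtime_hours(total_io_operations, iops):
    """Hours needed to issue a fixed number of I/Os at a steady IOPS rate."""
    return total_io_operations / iops / 3600


# Hypothetical reporting job that issues 72 million I/O operations.
TOTAL_IOS = 72_000_000

unlimited = runtime_hours(TOTAL_IOS, 10_000)  # no limit, assumed 10,000 IOPS
limited = runtime_hours(TOTAL_IOS, 2_000)     # policy limit of 2,000 IOPS

print(f"Without a limit: {unlimited:.0f} hours")
print(f"With a 2,000 IOPS limit: {limited:.0f} hours")
```

In this illustration the reports take 10 hours instead of 2, which still leaves time for delivery the next day. If that proves too slow, the limit in the policy can be raised.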
Virtual SAN is optimized for modern all-flash storage with space efficiency features such as deduplication, compression, and erasure coding that lower TCO while delivering incredible performance. Hardware expenditures are further reduced for 2-node cluster configurations using directly connected crossover cables. This lowers network switch costs, reduces complexity, and improves reliability, especially for use cases such as remote offices. The Virtual SAN iSCSI target service enables physical servers and application cluster workloads to utilize a Virtual SAN datastore. The performance and health services make it easy to verify Virtual SAN configurations and closely monitor key metrics including IOPS, throughput, and latency at the cluster, host, virtual machine, and virtual disk levels. Quality of service can be managed by using IOPS limits on a per-virtual machine and per-virtual disk basis. All of these services are precisely managed using VM-centric storage policies. Virtual SAN scales to 64 nodes per cluster with up to 150k IOPS per node using the latest hardware innovations such as NVMe. Deployment options include Virtual SAN Ready Nodes, Dell EMC VxRail turnkey appliances, and build-your-own. All of these utilize components certified by VMware and the leading hardware OEMs to simplify hyper-converged infrastructure deployment. This provides organizations a vast number of options to run any app, at any scale, on an enterprise-class platform powered by Virtual SAN.