The Next Frontier of Data Center Virtualization: Flash in the Era of Distributed Storage
A Deeper Dive into the Capabilities of OCZ’s VXL 1.2 Software Delivering the Next Revolution of Storage Architectures for Virtualized Environments
Allon Cohen, PhD
As more and more data centers reap the benefits of flash, server virtualization is heading towards its next revolution – treating host-based flash as an integral ingredient of the virtualized environment and utilizing it as a high-performance shared resource. IT managers are coming to realize that flash resources can and should be treated in exactly the same way that virtualized CPU and memory resources are handled, and not as a dedicated resource of a local host. Optimally, flash resources should be dynamically provisioned according to the needs of several to hundreds of Virtual Machines (VMs) in a single environment. With flash becoming an intrinsic part of the virtual environment, IT professionals are facing a new challenge in the evolution of storage and flash virtualization.
With the general availability of its next release of VXL Storage Acceleration and Virtualization Software (Version 1.2), OCZ Technology Group launched the next generation of flash integration architectures for virtualized data centers that can be implemented today. This new release combines advanced application-optimized caching with distributed volume virtualization of on-host flash, enabling such key virtualization services as synchronous mirroring, High Availability (HA), end-to-end Fault Tolerance (FT), and vMotion™ (without the loss of cache data).
Acceleration, Resiliency, Resource Sharing and Ubiquity represent the four key requirements of flash-based SSDs for virtualized data centers.
VXL delivers the ‘next frontier’ of flash virtualization by treating flash as a virtual resource and dynamically distributing OCZ Z-Drive R4 PCIe Solid-State Drive (SSD) flash across application VMs based on need to maximize performance. Though the PCIe SSD is located in one server, the flash resources can be shared amongst any application VM that resides on this server or on other servers in the virtualized cluster. By transparently distributing unallocated flash as a dynamic cache resource, VXL assures optimal flash utilization at all times, regardless of how many VMs are running concurrently.
This white paper provides a deeper dive into the differentiated capabilities of VXL and addresses the software’s ability to deliver advanced virtualized services for site/data recovery, live VM migration, and the dynamic sharing of the SSD flash resources by any VM in the virtualized cluster. As many server applications rely on storage devices to provide critical services, VXL is designed to not only move data onto host-based flash to achieve performance improvements, and efficiently utilize host resources, but also to provide those storage services critical to enterprise applications running anywhere across the virtualized data center as depicted in Figure 1.
The Data Overload Problem
Before we dive into this latest release of VXL, let’s revisit current virtual data center architectures and the key challenges they face today.
Traditional standalone servers have largely been replaced by virtualized compute infrastructures, in which dozens to hundreds of these servers are consolidated onto just a few powerful, multi-core servers running up to hundreds of VMs each. Though fewer standalone servers are required, this phase of virtualization has left the storage layer largely unchanged: Storage Area Network (SAN) and Network Attached Storage (NAS) systems have remained a separate tier, lacking significant vertical integration with the virtualization platform.
When server virtualization was added to an IT environment, all application data was typically placed in SAN or NAS appliances to retain the ability to dynamically run any application load from any data center server. While the CPU cores in the servers can generate millions of input/output operations per second (IOPS), a typical HDD can deliver only 100 to 200 IOPS. Servicing the storage requirements of virtualized servers with thousands of mechanical HDDs quickly became inefficient and wasteful in terms of data center CAPEX (capital expenditures) and OPEX (operating expenses). As more virtual servers are added to the data center, the HDDs within the SAN cannot keep up with the server workload demands of today’s hyper-connected world.
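The gap between server demand and HDD capability can be made concrete with a little arithmetic. The sketch below uses the 100 to 200 IOPS per-drive range quoted above; the one-million-IOPS target is a hypothetical figure chosen for illustration, and the calculation ignores RAID, controller, and caching effects.

```python
import math

def hdds_needed(target_iops: int, iops_per_hdd: int) -> int:
    """Drives required to reach target_iops, ignoring RAID and controller overhead."""
    return math.ceil(target_iops / iops_per_hdd)

# Hypothetical aggregate demand of 1,000,000 IOPS from the virtualized servers.
TARGET_IOPS = 1_000_000

for per_drive in (100, 200):  # per-HDD range quoted in the text
    print(f"{per_drive} IOPS/HDD -> {hdds_needed(TARGET_IOPS, per_drive)} drives")
```

Under these assumptions the array needs between 5,000 and 10,000 spindles purely to satisfy random I/O demand, regardless of how much capacity is actually required.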
As virtualization takes over the data center, many applications run concurrently on servers and their combined storage access requests are blended together by the virtualization layer. Running multiple VMs together in the virtualized environment randomizes the aggregate I/O stream going to the storage array, so the once sequential I/O patterns of these applications are replaced with randomized I/O forcing IT professionals to face a paradigm shift in how they manage their storage arrays.
Concurrently running multiple VMs in a virtualized environment requires strong random access capabilities, which is a major problem for HDDs.
Known as the ‘I/O blender effect,’ this is an inevitable occurrence with virtualization. For this reason, server virtualization requires strong random access capabilities that continue to be a major problem for HDDs and their physical design. See Figure 2.
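The blender effect can be illustrated with a small simulation. This is a minimal sketch, not a model of any particular hypervisor: each hypothetical VM issues a perfectly sequential read stream, and a shared queue picks the next request from a random VM. The function names and the random interleaving policy are assumptions made for illustration.

```python
import random

def sequential_stream(vm_id, start_block, length):
    """Yield (vm_id, block) requests that are strictly sequential for one VM."""
    for offset in range(length):
        yield (vm_id, start_block + offset)

def blend(streams, seed=0):
    """Interleave per-VM streams in random order, preserving each VM's own order."""
    rng = random.Random(seed)
    active = [iter(s) for s in streams]
    blended = []
    while active:
        chosen = rng.choice(active)
        try:
            blended.append(next(chosen))
        except StopIteration:
            active.remove(chosen)  # this VM has no more requests
    return blended

def sequential_fraction(requests):
    """Fraction of requests whose block directly follows the previous request's block."""
    if len(requests) < 2:
        return 1.0
    pairs = zip(requests, requests[1:])
    hits = sum(1 for (_, prev), (_, cur) in pairs if cur == prev + 1)
    return hits / (len(requests) - 1)

# Eight VMs, each reading 1,000 consecutive blocks from its own region.
streams = [sequential_stream(vm, vm * 1_000_000, 1000) for vm in range(8)]
blended = blend(streams)

print("single VM :", sequential_fraction(list(sequential_stream(0, 0, 1000))))
print("8 VMs     :", round(sequential_fraction(blended), 3))
```

Each VM's stream is fully sequential on its own, yet only a small fraction of adjacent requests in the blended stream remain sequential, which is the access pattern the storage array actually sees.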
Data centers are now experiencing a rapid revolution in the architecture of their virtualized environments, and this unbalanced server/storage architecture is changing quickly. Flash has already disrupted the storage landscape with the introduction of high-performance PCIe flash storage that places hundreds of thousands of IOPS and terabytes of storage capacity as close as possible to the CPU. Because its I/O capabilities are comparable to the access requirements of virtualized servers, it unleashes the full power of server virtualization. As a result, flash storage is now triggering the next wave of virtualization and becoming an integral and indispensable part of storage provisioning.
Storage, like other virtualized resources, will be provisioned by key services provided and will typically include the storage access parameters that any particular VM will receive (such as I/O performance and bandwidth) with a software layer that transparently uses the dynamically-resourced flash and HDD storage to service the VM at the defined service levels. To enable this, on-host flash must be used as a distributed cache resource, sharable between VMs, and accessible between multiple hosts. Critical for performance acceleration, the caching policies must interface with this shared resource (via application programming interfaces or APIs) to assure the right data is on flash at the right time.
The VXL Software Solution
VXL maximizes application performance in virtualized server environments by delivering data caching and flash virtualization into hypervisor platforms. In conjunction with the Z-Drive R4 PCIe SSD series, VXL distributes the flash caching resources on-demand based on VM need, making sure that no VM inefficiently occupies flash when it can be better used elsewhere in the environment. Flash volumes are automatically provisioned from host flash, with unallocated flash transparently distributed as a dynamic cache resource. No matter how many VMs are running simultaneously, VXL optimally utilizes flash at all times.
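The idea of distributing flash according to VM need can be sketched as a proportional split. This is an illustrative assumption only: the source does not describe how VXL measures need, so the sketch uses recent I/O activity as a stand-in, and the function name `share_flash` and the VM names are hypothetical.

```python
from typing import Dict

def share_flash(total_gb: float, recent_iops: Dict[str, float]) -> Dict[str, float]:
    """Split total_gb of cache across VMs in proportion to their recent I/O activity.

    Assumption: 'need' is approximated by recent IOPS; idle VMs receive nothing.
    """
    if not recent_iops:
        return {}
    total_activity = sum(recent_iops.values())
    if total_activity == 0:
        # No VM is active, so split evenly rather than divide by zero.
        even = total_gb / len(recent_iops)
        return {vm: even for vm in recent_iops}
    return {vm: total_gb * iops / total_activity for vm, iops in recent_iops.items()}

# Hypothetical 800 GB of cache shared by three VMs.
print(share_flash(800, {"db": 3000, "vdi": 1000, "idle": 0}))
```

Re-running the split as activity changes means an idle VM does not hold flash that a busy VM could use.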
Virtualization powered by OCZ VXL Software and Z-Drive R4 PCIe SSDs.
Flash volumes and cache-accelerated volumes (from internal HDDs and external SANs) are presented as distributed network resources accessible by any VM residing on the local server or on other servers in the virtualized cluster. With its unique ability to monitor all data requests, VXL can reduce data traffic to and from the SAN by up to 90%, storing critical data locally in the Z-Drive card. See Figure 3.
It is important to note that many competitive products are limited to accelerating only the applications running on the same server on which the cache resides, whereas VXL’s unique flash virtualization capabilities allow the cache to be exposed to more than one attached server.
Diving Into VXL Software
In a nutshell, VXL deploys as a virtual appliance into a virtualized environment and works directly with the hypervisor layer to manage and distribute on-host flash resources. Unlike other caching solutions, VXL does not require software agents in the accelerated guest VM. This alleviates one of the biggest challenges for the IT manager when deploying such a solution in a modern data center with hundreds of VM guests.
This centralized approach enables VXL to treat the entire flash capacity as a single virtual resource, dynamically distributing flash resources between the VMs according to need. By combining the power of storage virtualization with dynamic flash caching, and by working centrally rather than with each local VM, VXL efficiently takes full advantage of flash at all times.
VXL's Advanced Caching Algorithm
One of the major benefits of VXL Software is its advanced application policy-based algorithms that enable IT professionals to select from a set of optimized application-specific caching policies to make knowledgeable selections of what data to store in the cache.
The VXL caching algorithms take into account the specific needs of each VM and their priorities based on the application policy, and each policy combines the data collected from a complete storage access heat map with the application storage access DNA.
VXL then combines the policy selection engine with ‘read-ahead, read-around’ algorithms that keep the data in the cache relevant at all times. Sequential reads can be marked for either filtering or read-ahead into the cache based on the application-specific policy. The accessed heat map data is then combined with the sequential read detections and other I/O parameters to determine the best global cache data selection, taking into account the requests from the collection of connected VMs. The result provides high ROI in a virtualized environment where many VMs share the same flash and often do not reach peak workload requirements at the same time.
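The interplay between a heat map and sequential-read filtering can be sketched as follows. This is a simplified illustration under stated assumptions, not VXL's actual algorithm: the class name, the run-length threshold, and the "top-N hottest blocks" selection rule are all hypothetical, and read-ahead is omitted.

```python
from collections import Counter

class HeatMapSelector:
    """Counts block reads per VM and ignores long sequential scans when asked to."""

    def __init__(self, capacity_blocks, filter_sequential=True, run_threshold=4):
        self.capacity = capacity_blocks
        self.filter_sequential = filter_sequential
        self.run_threshold = run_threshold  # assumed cutoff for "sequential"
        self.heat = Counter()               # global heat map: block -> read count
        self._last_block = {}               # per-VM last block read
        self._run_len = {}                  # per-VM current sequential run length

    def record_read(self, vm, block):
        # Extend the VM's run if this block directly follows its previous one.
        if self._last_block.get(vm) == block - 1:
            self._run_len[vm] = self._run_len.get(vm, 1) + 1
        else:
            self._run_len[vm] = 1
        self._last_block[vm] = block

        if self.filter_sequential and self._run_len[vm] >= self.run_threshold:
            return  # policy says: streaming scan, do not heat the map
        self.heat[block] += 1

    def cache_candidates(self):
        """Return the hottest blocks across all VMs, up to cache capacity."""
        return {block for block, _ in self.heat.most_common(self.capacity)}

selector = HeatMapSelector(capacity_blocks=2)
for block in (10, 50, 10, 50, 10):       # VM "a": repeated random reads
    selector.record_read("a", block)
for block in range(100, 200):            # VM "b": one long sequential scan
    selector.record_read("b", block)
print(selector.cache_candidates())
```

In this sketch the repeatedly read blocks from one VM win the cache slots, while the other VM's one-pass scan is kept from displacing them.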
As a network distributed architecture, VXL allocates for each VM its dynamically optimized share of flash resources during peak usage regardless of its location in the network.
Business-rule Cache Warming
VXL features a unique ‘business-rule’ pre-warming cache engine that automatically pre-warms the cache in advance of important and demanding jobs, assuring that the relevant application data resides in cache in time for use by the application.
The virtual data center exhibits pre-determined activity cycles that generate peak I/O performance requirements at certain times, with different I/O profiles generated by different applications. As an example, a data warehouse job may run every night, and a VDI (virtualized desktop infrastructure) boot storm may occur every morning. These two applications would normally compete for flash cache resources, and this competition typically results in neither of them receiving the highest possible cache hit ratio.
VXL features a unique ‘business-rule’ pre-warming cache engine that adapts the flash cache to the business cycle in the data center. This enables IT managers to automatically pre-warm the cache in advance of important and demanding jobs, assuring that the relevant application data resides in cache in time for use by the application. Using the example above, the VDI boot data would be fully loaded in cache in the early morning hours, while the data warehouse hot areas would be fully loaded in cache late in the evening. In between these important jobs, various other jobs benefit from the dynamic cache resource, which is yet another way of utilizing the flash resource to its full potential.
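A time-based warming rule of the kind described above might be modeled like this. The rule table, dataset names, and window times are hypothetical placeholders, and the sketch only decides what to warm; it does not model how data is loaded into flash.

```python
from datetime import time

def in_window(now, start, end):
    """True if now falls in [start, end), allowing windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end

# Hypothetical business rules: (dataset to pre-warm, window start, window end).
WARMING_RULES = [
    ("vdi_boot_images", time(5, 0), time(8, 0)),        # ahead of the morning boot storm
    ("warehouse_hot_tables", time(21, 0), time(1, 0)),  # ahead of the nightly warehouse job
]

def datasets_to_warm(now, rules=WARMING_RULES):
    """Return the datasets whose warming window contains the given time of day."""
    return [name for name, start, end in rules if in_window(now, start, end)]

print(datasets_to_warm(time(6, 30)))
print(datasets_to_warm(time(23, 0)))
print(datasets_to_warm(time(12, 0)))
```

Outside both windows the list is empty, which corresponds to the period when the cache is left to other jobs as a dynamic resource.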
Advanced Virtualized Services Without a SAN
As more flash-based SSDs are deployed in the data center, IT professionals have come to realize that accelerated server applications, like any other application, rely on specific capabilities of the supported storage devices to provide critical services such as synchronous mirroring, High Availability, end-to-end Fault Tolerance, and vMotion. In some cases, these critical services are a prerequisite for running certain server application loads.
Shared SAN storage enables many key features of the virtual server environment such as live VM migration, distributed resource management and site/data recovery, but with these capabilities come additional resources and expenses required to deploy, maintain and upgrade. The cost efficiencies enabled by virtualization become compromised by the addition of more storage systems. What if the benefits of virtualization could be realized without having to use more external appliances, or better yet, not having to use a SAN at all? The result would be an all-silicon SAN-less virtual infrastructure that eliminates the need for physical hosts to connect to an external shared storage pool.
To enable a ‘no data loss/no VM downtime’ environment, IT professionals require that all virtualized data be synchronously mirrored and continuously available across the network. To accomplish this, both HA and FT services are needed, especially for mission-critical VMs. VMware Fault Tolerance is one of the most demanding features of virtualized environments, enabling continuous, non-interrupted availability of an application even during total server failures. VXL is designed not only to move data onto host-based flash to maximize performance and efficiently utilize host resources, but also to provide those critical storage services required for operating enterprise business applications.
VXL mirroring for High Availability and Fault Tolerance.
New capabilities within VXL enable Z-Drive R4 flash volumes to be virtualized and synchronously mirrored making them continuously available to support advanced HA and end-to-end FT services from within the virtualization host without the need for any back-end SAN or storage appliance. To enable this level of all-silicon ‘SAN-less’ virtualized services, VXL synchronously replicates flash data through mirroring (between Z-Drive R4 cards residing on redundant cluster servers), keeping two live identical copies of VM data (on two separate ESX hosts), down to the last bit. See Figure 4.
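The essence of synchronous mirroring is that a write is acknowledged only after both copies hold the data. The sketch below is a toy in-memory model under that assumption; the class names are hypothetical, and a real implementation would also need to handle partial failures, resynchronization, and network transport, none of which is shown.

```python
class FlashVolume:
    """Toy stand-in for a flash volume on one host."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def write(self, block, data):
        if not self.online:
            raise OSError(f"{self.name} is offline")
        self.blocks[block] = data

class MirroredVolume:
    """Keeps two replicas identical by writing to both before acknowledging."""

    def __init__(self, primary, secondary):
        self.replicas = (primary, secondary)

    def write(self, block, data):
        # Synchronous: the caller gets an acknowledgement only after
        # every replica has stored the block.
        for replica in self.replicas:
            replica.write(block, data)
        return True

    def read(self, block):
        # Serve the read from the first replica that is still online.
        for replica in self.replicas:
            if replica.online:
                return replica.blocks[block]
        raise OSError("no replica online")

host_a = FlashVolume("host-a")
host_b = FlashVolume("host-b")
mirror = MirroredVolume(host_a, host_b)
mirror.write(7, b"vm-data")

host_a.online = False        # simulate a total failure of the first host
print(mirror.read(7))        # data is still served from the surviving copy
```

Because the second copy was complete before the write was acknowledged, a host failure leaves no acknowledged data missing from the surviving replica.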
As soon as a failure occurs and the host server realizes that the original VM is down, the FT process kicks in, and a shadow VM takes over. Leveraging VMware Fault Tolerance, in combination with VXL’s flash virtualization and synchronous mirroring capabilities, the application in jeopardy picks up right where it left off delivering application availability, uninterrupted flash acceleration, no cache misses, no data losses and no VM downtime.
The net impact of a SAN-less, flash-only infrastructure is that the virtual environment becomes significantly simpler to manage on a day to day basis. Since all data is stored on Z-Drive R4 flash inside of each server, IT managers of cloud and enterprise data centers get a unified host building block with flash-level storage performance, as well as lower power and cooling expenses compared to traditional HDDs.
VMware vMotion Support
VXL Software fully supports vMotion, enabling transparent and dynamic VM migration from one server to another, without the loss of cache. In competitive solutions, once a VM is migrated to a new server, it will lose connectivity to its flash cache, and that source application will experience a sharp drop in performance until data is reloaded over time to the new cache, as the cached data on flash from the source system will be lost.
In contrast, VXL treats the cached data as a virtualized storage entity that can be continually accessed between ESX servers whenever VMs are migrated. As VMs migrate from the local ESX host to a remote host, VXL identifies the VMs that are remotely serviced and transfers the local flash connectivity to remote connectivity, eliminating the drastic performance drops that occur in other solutions due to the loss of cache.
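The switch from local to remote flash connectivity can be pictured with a small bookkeeping model. This is a conceptual sketch only; the `CacheDirectory` class and host names are hypothetical and do not reflect VXL's internal data structures.

```python
class CacheDirectory:
    """Tracks where each VM runs and which host holds its cached data."""

    def __init__(self):
        self.cache_host = {}  # VM -> host whose flash holds the VM's cache
        self.vm_host = {}     # VM -> host currently running the VM

    def place(self, vm, host):
        """Initial placement: the VM and its cache start on the same host."""
        self.cache_host[vm] = host
        self.vm_host[vm] = host

    def migrate(self, vm, new_host):
        """Move the VM; the cached data stays where it is and remains reachable."""
        self.vm_host[vm] = new_host

    def access_path(self, vm):
        """Report whether the VM reaches its cache locally or over the network."""
        return "local" if self.vm_host[vm] == self.cache_host[vm] else "remote"

directory = CacheDirectory()
directory.place("vm1", "esx-a")
print(directory.access_path("vm1"))
directory.migrate("vm1", "esx-b")
print(directory.access_path("vm1"))
```

The point of the model is that migration changes only the access path; the cached data itself is never discarded.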
"For many environments that already have an investment in a SAN, the SAN-less, all-flash architecture is an ideal refresh option. Instead of buying a new storage system, IT can leverage the existing one to focus on capacity-centric operations such as those designed to store older VMs and their data. Once the Z-Drive R4 card and VXL are installed, users can leverage vMotion to migrate the most active VMs to PCIe flash storage in their respective hosts, giving these production VMs a massive performance boost, a new level of availability, and a lessened load on the legacy storage system."
Remote Consolidated Management
VXL Software includes a central management application called StoragePro that provides remote consolidated management of the virtual environment and associated storage devices. The StoragePro graphical user interface (GUI) can manage multiple VXL implementations operating on a large number of virtualized servers located across the network. The application enables IT managers to administer their accelerated volumes from a single centralized GUI, to assign acceleration or caching policies, and to create and allocate flash volumes on any connected server. The StoragePro application also allows IT professionals to monitor VXL connectivity and performance in real time for easy optimization, fault detection and troubleshooting.
VXL Software is leading the way in turning today’s industry vision of intelligently distributing virtualized flash resources across a network of servers into reality. From its infancy, it has been designed to provide an optimized solution for virtual environments by dynamically provisioning and delivering caching services per-VM across the entire virtualized environment. VXL 1.2 delivers a trailblazing approach to flash virtualization by combining application-optimized caching with virtualized, highly available network-distributed flash volumes to deliver flash performance without compromising VM functionality.
VXL implements caching policies and turns the PCIe flash storage (sitting in the ESX host) into a shared resource across all servers in the data center. It resolves storage access bottlenecks, enables up to ten times the number of VMs on the same physical host, and delivers a winning combination of superior performance, greener IT enterprises, lower total cost of ownership (TCO) and higher ROI. The software also provides IT managers with the next wave of storage virtualization. Its unified virtualization architecture delivers flash as an integral, indispensable part of virtual environments, evolving the dynamic storage requirements of virtualized data centers to the ‘next frontier.’