Understanding Key Performance Metrics of OCZ’s Z-Drive R4 PCIe SSDs

PCIe SSD Blind Performance Test Survey Conducted by Calypso Systems

Scott Harlin
Mark Hayashida

Introduction

PCI Express (PCIe) SSDs are being deployed in the enterprise as primary mass storage, tiered storage, non-volatile flash cache and persistent memory, which has resulted in exponential demand for these devices and significant industry growth. Many leading storage intelligence groups forecast that within the next few years PCIe will become the dominant SSD storage platform for the enterprise. With this prevalence, there are a number of PCIe-based enterprise SSDs available to customers, each with its own set of features, benefits, capabilities and characteristics. To fully understand what these products are capable of, especially from a performance perspective when accelerating applications or data I/O access, an unbiased competitive matrix is highly desirable.

To fill this customer need, and as part of its ongoing series of enterprise SSD product comparisons, Calypso Systems, Inc., a leading provider of solid-state solutions testing and measurement, conducted a state-of-the-industry PCIe Blind Survey that reported performance comparisons (from June 2013 through September 2013) of five leading enterprise-class PCIe SSD edge card vendors. The survey included drives from OCZ (Z-Drive R4 Model RM88), Fusion-io (ioDrive2), Intel (Model 910), Micron (Model P320h), and Virident (FlashMAX II). All SSDs tested in Calypso’s PCIe Blind Survey were configured with Multi-Level Cell (MLC) NAND flash with the exception of one drive, Vendor 5, which was based on Single-Level Cell (SLC) NAND flash.

Z-Drive R4 Half-height PCIe SSD

Z-Drive R4 Full-height PCIe SSD

When compared to these leading enterprise-class PCIe edge cards, OCZ’s Z-Drive R4 was dominant in sequential performance from small to large block sizes, CPU utilization, and application workload performance (regardless of block size, access pattern mix or percentage of read and write operations), even when compared to an SLC-based device. In some cases, the Z-Drive R4 SSD delivered more than 12% faster input/output operations per second (IOPS) and close to 24% faster data transfers than the closest PCIe solution tested, as shown in the Calypso results that follow.

As each SSD vendor has its own testing methodology and evaluation process for presenting product specifications and performance to the industry, Calypso developed an unbiased, industry-standard methodology and process that provides an equal playing field for this type of evaluation. All of the testing for the PCIe Blind Survey was conducted using Calypso’s proven Reference Test Platform/Calypso Test Software (RTP/CTS), which is the official test platform used by the Storage Networking Industry Association (SNIA) in support of the Solid State Storage Initiative’s (SSSI’s) Performance Test Specification (PTS). Calypso is also the official SSSI-certified PTS testing facility.

The PCIe cards surveyed ranged in capacity from 350GB to 1,847GB and the identity of each card was intentionally blinded for results reporting by Calypso. In this white paper, the OCZ results are evident while the four other product brands are denoted as Vendor 1, Vendor 2, Vendor 4 and Vendor 5. The drives were tested in phases to include synthetic stress tests, SNIA benchmark tests and CTS enterprise application workload tests, and the results ranked PCIe performance, compared specific application workload results, and tested to different types of pre-conditioned write history states such as brand new/Fresh-out-of-the-Box (FOB), relatively new, and well-used (or seasoned).

The purpose of this white paper is to present some of the key findings and test results from the Calypso PCIe Blind Survey that position OCZ’s Z-Drive R4 Series as one of the foremost SSDs in its class. The Calypso performance test results demonstrate that the Z-Drive R4 SSD is particularly strong in small block random performance, large block sequential performance, CPU utilization and application workload acceleration. The paper briefly reviews the highlights of the testing, describes the testing methodologies and graphically presents key results.

Test Procedures

The Calypso PCIe Blind Survey tested industry leading PCIe cards using write saturation (WSAT), performance test specification (PTS) and Calypso test software (CTS) workloads in a variety of pre-conditioned states and test scenarios. This provided Calypso with a full spectrum of comparative performance results when a drive is new, relatively new and well-used.

The Calypso test platform included the following:

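The Calypso Blind Survey tested leading enterprise-class PCIe SSD edge cards from five different manufacturers as follows (in alphabetic order):

  • Fusion-io (ioDrive2)
  • Intel (Model 910)
  • Micron (Model P320h)
  • OCZ (Z-Drive R4 Model RM88)
  • Virident (FlashMAX II)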

As an industry leader in solid state storage performance testing, Calypso developed an entire test methodology for this Blind Survey. For the purposes of this white paper, however, a brief summary of the survey (rather than a deep dive) is presented:

  • All tests used synthetic workloads and a device PURGE
    • Synthetic refers to a known and repeatable test stimulus (a fundamental requirement for comparative performance testing)
    • Each test began with a device PURGE so that each drive would be in a state as if no writes had occurred (and used to establish a repeatable starting point)
  • All tests were conducted at the block I/O level to minimize the impact of software, drivers and system file cache
  • The Phase I WSAT tests covered saturated performance in a brand new (or FOB) use state with no write history
  • The Phase II basic PTS benchmark tests covered IOPS, throughput and latency in a well-used write history state
  • The Phase III advanced PTS benchmark tests covered CPU utilization in a well-used write history state
  • The Phase IV CTS workload library tests applied user application workloads in a relatively new write history state (that was not FOB but prior to write saturation)
  • Sequential data is data that is accessed in a predetermined order (or sequenced) versus random data which is data that is accessed given any coordinate location in a population of addressable data elements
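The distinction between sequential and random access drawn in the last bullet can be illustrated with a small sketch. The function below is hypothetical and is not part of Calypso's test software; it assumes a device addressed in fixed-size blocks and generates byte offsets for a workload with a given sequential/random mix, in the spirit of the access pattern mixes quoted for the application workloads in this paper.

```python
import random


def access_offsets(num_ops, block_size, device_blocks, seq_fraction, seed=0):
    """Return byte offsets for a mixed sequential/random access pattern.

    Hypothetical helper for illustration only. seq_fraction is the share of
    operations that continue from the previous block; the rest jump to a
    uniformly chosen block anywhere on the device.
    """
    rng = random.Random(seed)  # fixed seed keeps the stimulus repeatable
    block = 0
    offsets = []
    for _ in range(num_ops):
        offsets.append(block * block_size)
        if rng.random() < seq_fraction:
            block = (block + 1) % device_blocks   # sequential: next block
        else:
            block = rng.randrange(device_blocks)  # random: any block
    return offsets


sequential = access_offsets(4, 4096, 1_000_000, seq_fraction=1.0)
print(sequential)  # [0, 4096, 8192, 12288]
```

Seeding the generator reflects the requirement above that a synthetic workload be a known and repeatable stimulus: the same seed always yields the same sequence of offsets.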

Test Results

Write Saturation Tests

In a fresh-out-of-the-box (brand new) write history state, the Z-Drive R4 SSD delivered the best sequential write performance, regardless of whether the block size was large or small, as outlined in the following test results:

IOPS Saturation Test

100% Sequential Write Operations (4k blocks)

Write Saturation 4k Sequential Workloads

Bandwidth Test

100% Sequential Write Operations (128k blocks)

Write Saturation 128k Sequential Workloads

The tested 1.6TB full-height (FH) Z-Drive R4 SSD is regarded as one of the highest performing PCIe SSDs currently on the market. It uses an on-board processor to manage critical internal functions, which leads to superior performance aggregation, significantly higher throughput and a reduced burden on the host CPU. OCZ’s Virtual Controller Architecture (VCA) bypasses traditional storage overhead, reduces latencies, increases throughput, and enables efficient processing of mass quantities of data. In the Calypso tests, the drive transferred over 2,400 MB/s of data and completed over 289K IOPS.
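IOPS and throughput figures such as these are linked by block size: throughput equals IOPS multiplied by the bytes moved per operation. The sketch below works through that arithmetic. It assumes, as the test headings suggest but the text does not state outright, that the IOPS figure comes from the 4k test and the MB/s figure from the 128k test. It also assumes 1k = 1,024 bytes and 1 MB = 1,000,000 bytes.

```python
def throughput_mb_s(iops, block_bytes):
    """Throughput in MB/s (1 MB = 10**6 bytes) for a given IOPS and block size."""
    return iops * block_bytes / 1_000_000


def iops_for_throughput(mb_s, block_bytes):
    """IOPS needed to sustain a given MB/s at a given block size."""
    return mb_s * 1_000_000 / block_bytes


# Assumed pairing of the quoted figures with block sizes (see the note above).
print(round(throughput_mb_s(289_000, 4 * 1024)))      # 1184
print(round(iops_for_throughput(2_400, 128 * 1024)))  # 18311
```

Under these assumptions, 289K IOPS at 4k corresponds to roughly 1,184 MB/s, while 2,400 MB/s at 128k requires only about 18,300 operations per second. This is why small block results are usually quoted in IOPS and large block results in MB/s.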

CPU Utilization Tests

In a well-used write history state, the Z-Drive R4 SSD showed one of the lowest CPU utilizations in a small block workload made up predominantly of random write operations, as outlined in the following test results:

CPU Utilization Test

35% Sequential / 65% Random Write Operations (4k blocks)

CPU% Utilization (est) 65%:35% Random Write Workloads

The Z-Drive R4 SSD’s efficient processor architecture and driver stack (VCA) enable more data to be serviced by the SSD while reducing the amount of CPU (and memory) overhead needed to manage the SSD and storage stack. As a result, the server’s CPU utilization remains low with the Z-Drive, leaving the processor free to perform more tasks and do more work for the applications.

Application Workload Tests

In a reasonably new write history state (not FOB and not fully saturated), the Z-Drive R4 SSD consistently delivered either the best or second-best performance when compared to the competition for the majority of Phase IV application workload tests regardless of block size, access pattern mix (random versus sequential), or operational mix (percentage of reads and writes performed).

For these sets of tests, Calypso used its CTS Workloads Library consisting of enterprise application workloads that were tested in two groups. Group A predominantly tested small block random (RND) workloads while Group B primarily tested large block sequential (SEQ) workloads. The CTS access patterns were defined by common industry usage and/or by customer definition as follows:

Group A

Group A Application Workloads

Group B

Group B Application Workloads

Group A Test Results

For the first two tests, covering Web Server workloads and MS Exchange Mail Server workloads, it is common practice to configure cluster and application page sizes to use 64k blocks as those sizes grow. Microsoft recommends using 64k blocks for all Exchange, Exchange Database (EDB) and log files as part of its best practice model. The following examples depict results that are indicative of these types of application workloads:

Web Server Workloads (64k blocks)

25% Sequential / 75% Random   95% Reads / 5% Writes

Web Server 64k Workloads

Exchange Mail Server Workloads (64k blocks)

100% Random   67% Reads / 33% Writes

Exchange Mail Server 64k Workloads

The next set of tests covers logging workloads, which are fundamental components of any high-transaction platform architecture. Relational database management systems (RDBMSs), such as Microsoft SQL Server, depend on reliable high-speed persistent storage for logging, which often becomes a key area of I/O contention. OCZ Z-Drive R4 SSDs can alleviate this congestion efficiently due to the NAND controller design within their VCA Technology. The following examples depict typical results that are indicative of these types of logging workloads:

Web Server Log Workloads (8k blocks)

100% Sequential Writes

Web Server Logs 8k Workload

SQL Server Log Workloads (8k blocks)

100% Sequential Writes

SQL Server Logs 8k Workload

The next set of tests covers real-time OnLine Transaction Processing (OLTP) applications. OLTP workloads typically involve small random blocks in a mixed read/write workload, where the industry standard test mix is 70% read operations and 30% write operations. OLTP applications generate heavy multi-user activity, with concurrent data access and real-time data management that require relatively fast, small changes and updates to the underlying database. The following example depicts results that are indicative of this type of application workload:

OLTP Server Workloads (8k blocks)

100% Random   70% Reads / 30% Writes

OLTP Server 8k Workload

As shown by the Group A test results above, OCZ’s Z-Drive R4 SSDs demonstrate excellent performance for applications utilizing small block I/O transfers and are ideally suited for applications that require uncompromising mass data transfers. As the block sizes increase for the next set of large block test results examined below (64k, 512k, 1MB and 2MB), the performance potential of OCZ’s Z-Drive R4 platform is realized further.

Group B Test Results

This next set of test results covers an application profile whose workloads call for much larger block size data transfers. A good example is Decision Support Service (DSS) applications, which manage large amounts of data in huge, heavily indexed table structures and therefore place considerable pressure on the storage stack to deliver this mass of data in real time. Application workloads tested in this area include DSS and Video on Demand (VoD), as outlined in the following test results:

Decision Support Service Server Workloads (64k blocks)

100% Random Reads

Decision Support Service 64k Workload

Video on Demand Server Workloads (512k blocks)

100% Sequential Writes

Video on Demand Server 512k Workload

Applications work best when they are not constrained by the amount of server main memory available to them. It is very common in physical and virtual environments to not have enough main memory to service all of the application-critical data 100% of the time. In cases where the address requested by the application is not in memory, the OS must request the data from underlying storage and transfer it into main memory. Much of this transfer activity is performed at moderate block sizes, in a sequential pattern. A drive that can move data quickly into main memory provides a huge benefit for application workloads that must often page data from storage. Application workloads tested in this area include OS paging, media streaming, archival and medical imaging, as outlined in the following test results:

OS Paging Server Workloads (64k blocks)

100% Sequential   90% Reads / 10% Writes

OS Paging 64k Workload

Media Streaming Server Workloads (64k blocks)

100% Sequential   98% Reads / 2% Writes

Media Streaming Server 64k Workload

Archive Server Workloads (2,048k blocks)

5% Sequential / 95% Random   55% Reads / 45% Writes

Archive Server 2MB Workload

Medical Imaging Server Workloads (1,024k blocks)

95% Sequential / 5% Random   5% Reads / 95% Writes

Medical Imaging Server 1MB Workload

The Z-Drive R4’s ability to efficiently aggregate data throughput across multiple NAND flash devices on a single PCIe edge card provides the best solution to alleviate I/O contention for these applications. As applications require larger block size data transfers, the Z-Drive R4 demonstrates even better performance versus the other tested products. Essentially, the larger the block size the better the Z-Drive R4 performance becomes over competitive solutions as it has the ability to better utilize the PCIe pipe due to more parallelism inherent in OCZ’s VCA Technology.
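For reference, the Group A and Group B workload parameters quoted in the headings above can be gathered into one structure. This is a sketch only: the `Workload` class and its field names are hypothetical and do not reflect how Calypso's CTS Workloads Library represents these workloads. The values are transcribed from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    group: str    # "A" or "B"
    block_k: int  # block size in k (1k = 1,024 bytes)
    seq_pct: int  # % sequential accesses; the remainder is random
    read_pct: int # % read operations; the remainder is writes


WORKLOADS = [
    Workload("Web Server", "A", 64, 25, 95),
    Workload("Exchange Mail Server", "A", 64, 0, 67),
    Workload("Web Server Log", "A", 8, 100, 0),
    Workload("SQL Server Log", "A", 8, 100, 0),
    Workload("OLTP Server", "A", 8, 0, 70),
    Workload("Decision Support Service", "B", 64, 0, 100),
    Workload("Video on Demand", "B", 512, 100, 0),
    Workload("OS Paging", "B", 64, 100, 90),
    Workload("Media Streaming", "B", 64, 100, 98),
    Workload("Archive Server", "B", 2048, 5, 55),
    Workload("Medical Imaging", "B", 1024, 95, 5),
]

# Example query: workloads in which writes outnumber reads.
write_heavy = [w.name for w in WORKLOADS if w.read_pct < 50]
print(write_heavy)
```

Laying the workloads out this way makes the three axes of each test (block size, access pattern mix and read/write mix) easy to filter and compare.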

Conclusion

As each SSD vendor has its own testing methodology and evaluation process for presenting product specifications and performance to the industry, Calypso developed an unbiased, industry-standard methodology and process to enable as level a playing field as possible for this type of evaluation. One of the competitive devices tested was based on SLC NAND flash, while the remaining four enterprise SSDs used MLC NAND flash. Although SLC NAND flash natively provides faster write performance, this series of tests shows that the OCZ Z-Drive R4 with MLC NAND performed better than the SLC drive (Vendor 5) in many of the workloads and use cases tested. Areas where the OCZ Z-Drive R4 dominated the other vendors in the Calypso tests include:

  • Small and large block sequential performance in an FOB write history state
  • CPU utilization in a well-used write history state
  • Application workload performance in a reasonably new write history state (regardless of block size, access pattern mix or percentage of read/write operations performed)

One of the contributing factors to OCZ’s overall success through the Calypso PCIe Blind Survey results is the company’s proprietary Virtual Controller Architecture (VCA) Technology that provides a complete storage subsystem to efficiently manage all transactions while virtualizing Z-Drive R4 resources into a massive parallel array of memory. It works in conjunction with OCZ’s storage driver architecture to provide performance aggregation across the NAND controllers via an intelligent complex command queuing structure that utilizes both native and tagged command queuing to enable command switching and load balancing based on proprietary algorithms.

This differential advantage, as supported by the Calypso PCIe Blind Survey results, delivers balanced drive loading while maximizing internal bandwidth to achieve near-linear performance aggregation, accelerated application performance with significantly faster response times, and reduced I/O and bandwidth bottlenecks without burdening host CPU or memory resources.

About Calypso Systems

Calypso Systems, Inc. designs, develops and sells sophisticated test and measurement instrumentation used in the characterization of solid state storage device performance and has been providing high performance test equipment to the mass storage industry since 1991.

The company is actively involved in industry standards work and has assumed a leadership role in developing and implementing performance standards. This has included chairing the SNIA Solid State Storage Technical Working Group (TWG) as well as developing the SNIA Solid State Storage Performance Test Specification (PTS). Calypso is the official Solid State Storage Initiative (SSSI) certified PTS testing facility. It is also on the Flash Memory Summit (FMS) Advisory Committee and is the primary organizer of the FMS Test & Measurement conference track.

Calypso maintains close technical and working relationships with industry leading providers of SSDs, controllers, and NAND flash memory, as well as resellers and system integrators, to investigate and explore the latest in solid state storage products and technologies. As part of its ongoing series of enterprise SSD product comparisons, Calypso has completed Blind Surveys covering SATA SSDs (2009), SAS SSDs (2011), and PCIe SSDs (2013).

About The PCIe SSD Blind Survey 2013

The Calypso PCIe SSD Blind Survey 2013 is available for purchase and immediate downloading. Over 200 tests were performed in a 3-month period of testing. The deliverables include:

  • 20-page executive summary (overview)
  • 250-page slide deck with analysis and charts
  • 410-page appendix with SNIA formatted test results

The Executive Summary provides a general overview of the Blind Survey, introduces the survey tests and presents a summary of selected test results. Detailed test results are presented in the companion PCIe Blind Survey PowerPoint slide deck, with over 400 pages of appendices that contain selected pages in the standardized SNIA Report format.

All encompassing, the survey provides:

  • A comparison between the sample pool cards on PTS benchmark tests
  • A general comparison of enterprise-class PCIe SSD card performance
  • A comparison of tests at FOB, PTS steady state saturation and CTS seasoned pre-conditioning
  • A ranking, value and score by test

For more information on the Calypso PCIe SSD Blind Survey 2013, or to purchase/download the report, contact info@calypsotesters.com.
