LSI Cachecade 2.0 Evaluation - The benefits of caching
15 November 2011 Source: In house
LSI Cachecade 2.0 promises to be the answer to some of the IT manager's most serious storage pain points, specifically those related to performance and random I/O. This new functionality is added to a number of LSI's existing SAS 6Gbit RAID controllers, bringing SSD caching to an already impressive feature set to help improve performance and negate some of the pitfalls of traditional hard disk technology.
Today it's possible to purchase magnetic hard disks with capacities of up to 3TB for under £300, but they struggle to provide 200 IOPS; some SSDs easily achieve 50,000 IOPS, yet cost 5-10 times as much for only a few hundred gigabytes of capacity.
LSI Cachecade™ and other similar hybrid storage pools enable IT managers to take the main benefit of magnetic disk, low cost per gigabyte (£/GB), and marry it with the main benefit of SSD, IOPS performance. It's a no-brainer.
Implementation of the new cache feature, named "Cachecade™", is very simple: the RAID controller continues to operate with the same functionality, drivers and toolset as before. The latest firmware is required to enable the Cachecade™ feature, so existing users will need to upgrade in advance. To utilise the feature, customers must also purchase the Cachecade™ license and apply it to their controller, or purchase a controller with it preinstalled. If purchased separately, the license is provided as a key code, a binary file or, in some cases, a physical hardware key.
Each controller can then be equipped with up to 512GB of standard S-ATA or SAS SSDs from LSI's validated selection, assigned as a virtual caching drive; this can be as simple as a single disk or as many as 32 SSDs in a RAID volume. Whilst RAID may not be for everyone, for critical applications at least 2 SSDs are likely to be used in a RAID 1 mirror, alongside a standard BBU to protect the controller cache.
New or existing RAID arrays utilising standard magnetic/spinning hard disks can be enabled for Cachecade™ by associating these RAID volumes with the new SSD pool. From here the array data is automatically cached using the pool for reads and optionally for write operations.
The controllers officially supporting Cachecade™ are as follows:
LSI MegaRAID 9260 Series
LSI MegaRAID 9265 Series
LSI MegaRAID 9280 Series
LSI MegaRAID 9285 Series
Additionally the Supermicro range of AOC-U/SAS2H8 controllers can be enabled for Cachecade™ with the appropriate soft or hardware key.
Test configuration
- Supermicro 846E16 Storage Chassis
- Supermicro X8DTH-6 Motherboard
- Dual Xeon X5670 CPUs
- 8GB DDR-3 Registered ECC Memory
- LSI 2108 RAID Controller - Firmware 12.13.0-0104 - Driver 5.1.112.64 - Cachecade 2.0™ Enabled
- 4 x Western Digital WD2002FYPS 2TB Green Power S-ATA HDD
- 1 x Toshiba MK1001GRZB (0106) SSD for Cachecade
- 2 x Toshiba MK1001GRZB (0106) SSD for Cachecade in RAID 0 for Cachecade Stripe
- Windows 2008 R2 x64 Standard
- IOMeter 2006.7.27 - Tests performed with 8 workers, 8 outstanding I/Os per worker
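The IOMeter settings in the configuration above imply a fixed number of requests in flight. As a rough sketch, Little's law relates that queue depth to IOPS and average latency. The worker and outstanding I/O counts come from the test configuration; the latency values are illustrative assumptions, not measurements from this evaluation.

```python
# Sketch: relate the IOMeter queue depth to IOPS via Little's law.
# WORKERS and OUTSTANDING_PER_WORKER come from the test configuration;
# the latency values below are hypothetical, for illustration only.

WORKERS = 8
OUTSTANDING_PER_WORKER = 8


def total_outstanding(workers: int, per_worker: int) -> int:
    """Total I/O requests the benchmark keeps in flight."""
    return workers * per_worker


def iops_from_latency(queue_depth: int, avg_latency_s: float) -> float:
    """Little's law: throughput = requests in flight / average latency."""
    return queue_depth / avg_latency_s


queue_depth = total_outstanding(WORKERS, OUTSTANDING_PER_WORKER)
print("requests in flight:", queue_depth)

# Assumed average latencies: a slow, seek-bound case and a fast, cached case.
for label, latency_ms in [("assumed 100 ms", 100.0), ("assumed 1 ms", 1.0)]:
    iops = iops_from_latency(queue_depth, latency_ms / 1000.0)
    print(f"{label}: about {round(iops)} IOPS")
```

With the queue depth held constant, any change in measured IOPS between caching scenarios reflects a change in average request latency.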
Testing Methodology
For simplicity we will focus on the four corners of disk performance testing, although more extensive testing has also been conducted; these are:
4K Random Read - Simulating random small block I/O similar to database / OLTP applications with a significant read bias.
4K Random Write - Simulating random small block I/O similar to database / OLTP applications with a significant write bias.
1MB Sequential Read - Simulating high throughput applications with a read bias such as video playback.
1MB Sequential Write - Simulating high throughput applications with a write bias such as data or video capture.
A RAID 5 using 4x WD2002FYPS drives was created and tests were conducted in 4 different caching scenarios to help understand the benefits of Cachecade™.
- Cache on - The controller's DRAM cache, read ahead, IO cache and disk cache were enabled.
- Cache off - The controller's DRAM cache, read ahead, IO cache and disk cache were disabled.
- Cachecade™ - Hot - The controller's DRAM cache, IO cache and disk cache were enabled, read ahead disabled and Cachecade™ 2.0 was enabled in Write Back mode (read and write). Additionally IOMeter was instructed to only use the first 62 million sectors of the RAID 5 volume to limit the target areas to around 30GB to simulate a hot spot of activity. 3 benchmarks were run to fill the cache in advance and results were collected on the 4th. This is simulating a working dataset which is small enough to reside permanently in the SSD cache.
- Cachecade™ - Cold - The controller's DRAM cache, IO cache and disk cache were enabled, read ahead disabled and Cachecade™ 2.0 was enabled in Write Back mode (read and write). IOMeter was instructed to use the whole disk and spread the transfers across any of the sectors in the whole 5.45TB RAID 5 volume - simulating a dataset much greater than the SSD cache. 3 benchmarks were run in an attempt to fill the cache in advance and results were collected on the 4th.
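The capacity figures in the scenarios above can be checked with some simple arithmetic. This is a minimal sketch which assumes 512-byte sectors and decimal drive capacities (2TB taken as 2 x 10^12 bytes); neither assumption is stated in the test notes.

```python
# Sketch: check the hot-spot and volume sizes quoted in the test scenarios.
# Assumptions (not stated in the article): 512-byte sectors, and nominal
# drive capacity in decimal units (2 TB = 2 * 10**12 bytes).

SECTOR_BYTES = 512           # assumed sector size
DRIVE_BYTES = 2 * 10**12     # assumed nominal capacity of each 2TB drive
DRIVES = 4                   # drives in the RAID 5 set


def hot_spot_bytes(sectors: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Size of the region IOMeter was restricted to."""
    return sectors * sector_bytes


def raid5_usable_bytes(drives: int, drive_bytes: int) -> int:
    """RAID 5 gives up one drive's worth of capacity to parity."""
    return (drives - 1) * drive_bytes


hot = hot_spot_bytes(62_000_000)
usable = raid5_usable_bytes(DRIVES, DRIVE_BYTES)

print(f"hot spot: {hot / 10**9:.1f} GB ({hot / 2**30:.1f} GiB)")
print(f"RAID 5 usable: {usable / 10**12:.1f} TB ({usable / 2**40:.2f} TiB)")
print(f"hot spot share of volume: {hot / usable:.2%}")
```

Under these assumptions the hot spot comes to roughly 30GB and the volume to just under 5.46 binary terabytes, in line with the figures quoted above. The hot spot covers well under 1% of the volume.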
Benchmark results
The benefits of Cachecade™ are clear to see in this first test, 4K random read. Random reads are a very difficult operation to accelerate, as you cannot predict exactly which data will be needed next. Read-ahead algorithms can predict access which follows a pattern or is sequential, but not completely random access. Where the dataset is already hot and in the cache we can see the benefits of SSD straight away: more than 150 times the IOPS of a non-Cachecade™ volume.
Interestingly, where the dataset is much larger than the SSD cache there is little or even no improvement in performance, as the number of cache hits is low due to the unpredictable access pattern.
Again, in the 4K random write test, Cachecade™ shows impressive benefits over standard disk arrays for hot cache data, with a more than 20 times improvement in performance thanks to the high IOPS and low latency provided by the SSD cache. This is a significant gain for a relatively modest investment.
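The gap between the hot and cold results can be illustrated with a simple hit-ratio model. This is a sketch only: it assumes uniformly random access, a fully warmed cache, a hypothetical 100GB cache size, and single-device service rates taken from the ballpark figures quoted earlier in this article (200 IOPS for a hard disk, 50,000 IOPS for an SSD). It does not model the RAID 5 array or the controller itself.

```python
# Sketch: why an SSD cache helps little when random accesses span far more
# data than the cache can hold.
# Assumptions: uniform random access, a warmed cache, a hypothetical cache
# size, and the ballpark per-device IOPS figures quoted in the article.

CACHE_BYTES = 100e9    # hypothetical cache size, for illustration only
SSD_IOPS = 50_000      # ballpark SSD figure from the article
HDD_IOPS = 200         # ballpark hard disk figure from the article


def hit_ratio(cache_bytes: float, dataset_bytes: float) -> float:
    """With uniform random access, hits scale with the share of data cached."""
    return min(1.0, cache_bytes / dataset_bytes)


def effective_iops(hit: float, cache_iops: float, disk_iops: float) -> float:
    """Average service time is the hit-weighted mix of the two tiers."""
    avg_time = hit / cache_iops + (1.0 - hit) / disk_iops
    return 1.0 / avg_time


for name, dataset_bytes in [("hot, ~30GB", 30e9), ("cold, ~6TB", 6e12)]:
    h = hit_ratio(CACHE_BYTES, dataset_bytes)
    iops = effective_iops(h, SSD_IOPS, HDD_IOPS)
    print(f"{name}: hit ratio {h:.1%}, roughly {iops:.0f} IOPS")
```

In this model the cold case stays close to the hard disk figure because almost every request still has to wait for a spinning disk.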
Interestingly, the sequential read throughput tests tell a different story when using a single SSD: Cachecade™ appears to hinder these types of workloads. This is simply explained. We have moved our data from several hard disks, which are actually well suited to sequential workloads, onto a single SSD, and as a result performance is now somewhat reliant on the characteristics of that SSD. One SSD's throughput is lower than that of 4 magnetic disks working together. One way to improve performance and gain the best of both worlds is to add more SSDs and create a RAID 0 or, better still, a RAID 10.
Our last test, sequential write, really shows the benefits of having a write cache. It's well known that with no write caching at all, write performance is painfully low, as every write is committed to disk and confirmed before the next transaction starts. The standard RAID controller and disk caches give an excellent boost over the uncached array; in this synthetic test the controller's DRAM cache can handle more than 2GB/s of throughput, a significant figure. Cachecade™ is able to show a benefit over an uncached array, however the performance is again slower than without Cachecade™ for these purely sequential operations. Notably, performance is better with a cold cache, possibly because more of the cache is available for write operations.
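The striping argument can be sketched as a simple sizing calculation. The per-device throughput figures below are hypothetical placeholders, not measurements from this evaluation, and the model assumes ideal linear scaling across striped devices, which real arrays only approximate.

```python
# Sketch: estimate how many striped SSDs are needed to match the sequential
# throughput of a hard disk array.
# Assumptions: ideal linear scaling, and hypothetical per-device figures.
import math

HDD_MB_S = 100.0   # hypothetical sequential throughput of one hard disk
SSD_MB_S = 250.0   # hypothetical sequential throughput of one SSD
HDD_COUNT = 4      # disks in the array under test


def stripe_throughput(devices: int, per_device_mb_s: float) -> float:
    """Idealised striping: sequential throughput scales with device count."""
    return devices * per_device_mb_s


def ssds_needed(target_mb_s: float, ssd_mb_s: float) -> int:
    """Smallest SSD stripe that matches or exceeds the target throughput."""
    return math.ceil(target_mb_s / ssd_mb_s)


array_mb_s = stripe_throughput(HDD_COUNT, HDD_MB_S)
print(f"hard disk array: about {array_mb_s:.0f} MB/s")
print(f"single SSD: about {SSD_MB_S:.0f} MB/s")
print(f"SSDs needed in a stripe: {ssds_needed(array_mb_s, SSD_MB_S)}")
```

Substituting measured figures for the placeholders gives a first estimate of how large the SSD stripe needs to be before the cache stops being the bottleneck for sequential work.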
Conclusions
Cachecade™ is not suited to every workload; however, in those cases where gains can be had the benefits are significant. Our benchmarks are synthetic and won't exactly replicate a real-world environment, but they can help us to generalise and make assumptions based on different workloads.
If random reads or writes in specific hot spots of data are a pain point for your application, then Cachecade™ is a simple addition to your server which can deliver performance gains of 20 to 150 times. All this is available for what is essentially a modest investment which consumes little power and requires no specialist knowledge to deploy.
On the other hand, if your workload is biased towards sequential disk access you may be better off ignoring Cachecade™ altogether, as you might even lose performance. In those cases, striping several SSDs together to achieve better throughput will help achieve the best of both worlds.
In reality, most workloads have a mixture of both random and sequential access, which means that there is likely a benefit to be had for most applications. The question is: does the performance justify the cost?
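One rough way to approach that question is Amdahl's law: only the share of I/O time that the cache accelerates gets faster. The sketch below applies the 20x and 150x gains from our hot-cache results to a few assumed workload mixes. The fractions are hypothetical, and the model ignores any slowdown on the sequential portion.

```python
# Sketch: Amdahl's law applied to a mixed workload.
# The accelerated fractions are hypothetical; the 20x and 150x gains are the
# hot-cache random write and random read results reported above.


def overall_speedup(accelerated_fraction: float, speedup: float) -> float:
    """Only the accelerated share of I/O time shrinks; the rest is unchanged."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / speedup)


for fraction in (0.25, 0.5, 0.9):
    for gain in (20, 150):
        result = overall_speedup(fraction, gain)
        print(f"{fraction:.0%} of I/O time accelerated {gain}x: "
              f"{result:.2f}x overall")
```

The overall gain depends far more on how much of the workload hits the hot spot than on whether the cached portion is 20 or 150 times faster.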