Storage Design Considerations - Part 2a - Designing for Capacity (Choosing the Right RAID Level)

In the last two posts of this series we looked at an overview of the factors affecting your storage design and discussed designing storage from an availability point of view. This post focuses on the capacity-related factors you should keep in mind before you start configuring your storage devices or jump ahead and start creating your VMFS datastores.

When designing storage from a capacity point of view, keep in mind that the capacity you choose should not only be enough for the initial deployment but should also be scalable, so that future storage expansion can happen without any fuss.

We will start this post with choosing the RAID type for your storage arrays.

Nearly all organizations using a storage array or SAN in their production environment configure RAID on the array, so that future disk failures can be dealt with easily and the environment does not suffer any data loss. But the important question is:

Which RAID level should I choose?

The choice of which RAID type to use, like most storage decisions, comes down to availability, performance, capacity, and cost. In this section, the primary concerns are both availability and capacity.

The picture below, taken from the book “vSphere Design”, compares how different RAID types mix the data-to-redundancy ratio:

[Figure: data-to-redundancy ratio of different RAID types]

RAID 0

RAID 0 stripes all the disks together without any parity or mirroring. Because no disks are lost to redundancy, this approach maximizes the capacity and performance of the RAID set. However, with no redundancy, just one failed disk will destroy all of your data.
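
To put a rough number on that risk, the sketch below estimates the chance that a RAID 0 set gets through a given period without losing any disk. It assumes every disk fails independently with the same probability, which is a simplification, and the 3% per-disk figure is a made-up value for illustration, not a measured one.

```python
def raid0_survival_probability(disk_count: int, disk_failure_prob: float) -> float:
    """Chance that no disk in a RAID 0 set fails during the period.

    Assumes independent failures with the same per-disk probability.
    Losing any one disk loses the whole set, so every disk must survive.
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    if not 0.0 <= disk_failure_prob <= 1.0:
        raise ValueError("disk_failure_prob must be between 0 and 1")
    return (1.0 - disk_failure_prob) ** disk_count


# Hypothetical 3% chance that a single disk fails during the period.
for disks in (2, 4, 8, 16):
    print(disks, "disks:", round(raid0_survival_probability(disks, 0.03), 4))
```

The more disks you stripe across, the lower the chance that all of them survive, which is why adding disks to a RAID 0 set increases capacity and risk at the same time.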

[Figure: RAID 0]

RAID 0 is never considered a good choice for running virtual workloads, because the failure of a single device means losing all of the data.

RAID 10

RAID 10, also known as RAID 1+0, combines disk mirroring and disk striping to protect data.

A RAID 10 configuration requires a minimum of four disks, and stripes data across mirrored pairs. As long as one disk in each mirrored pair is functional, data can be retrieved. From an availability perspective, this approach gives an excellent level of redundancy, because every block of data is written to a second disk.

Multiple disks can fail as long as one copy of each pair remains available. Rebuild times are also short in comparison to other RAID types. However, capacity is effectively halved, because in every pair of disks one disk holds nothing but a mirror copy of the other. This makes RAID 10 the most expensive of the options discussed here in terms of cost per usable gigabyte.

[Figure: RAID 10]

RAID 10 provides redundancy and performance, and is the best option for I/O-intensive applications. One disadvantage is that only 50% of the total raw capacity of the drives is usable due to mirroring.
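
The rule that a RAID 10 set survives as long as one disk in each mirrored pair is still working can be checked with a few lines of code. This is only a sketch: the disk names (`d0` to `d3`) and both function names are hypothetical, and a real array tracks disk state in its own way.

```python
from itertools import combinations


def raid10_survives(mirror_pairs, failed_disks):
    """Return True if every mirrored pair still has at least one working disk."""
    failed = set(failed_disks)
    return all(not (a in failed and b in failed) for a, b in mirror_pairs)


def count_survivable_double_failures(mirror_pairs):
    """Count how many of the possible two-disk failures the set survives.

    Returns a tuple: (survivable combinations, total combinations).
    """
    disks = [disk for pair in mirror_pairs for disk in pair]
    combos = list(combinations(disks, 2))
    survivable = sum(1 for combo in combos if raid10_survives(mirror_pairs, combo))
    return survivable, len(combos)


# Hypothetical four-disk set: two mirrored pairs striped together.
pairs = [("d0", "d1"), ("d2", "d3")]
print(raid10_survives(pairs, {"d0", "d3"}))   # one disk lost from each pair
print(raid10_survives(pairs, {"d2", "d3"}))   # both disks of one pair lost
print(count_survivable_double_failures(pairs))
```

The only fatal two-disk failures are the ones that take out both halves of the same mirror, so the share of survivable combinations grows as more pairs are added.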

RAID 5

RAID 5 is a RAID configuration that uses disk striping with parity.
Because data and parity are striped across all of the disks, no single disk is a bottleneck. The parity information also allows the array to reconstruct data in case of a disk failure.
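
How parity makes reconstruction possible is easier to see with a small example. The sketch below uses XOR parity over a single stripe of three data blocks. It is a simplified illustration under that assumption: a real array works on many stripes and rotates the parity block across the disks, and the function names here are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing_block(surviving_blocks):
    """Rebuild the one missing block from the surviving data and parity blocks."""
    return xor_blocks(surviving_blocks)


# One stripe: three data blocks plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] failed and rebuild its block.
rebuilt = rebuild_missing_block([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR-ing everything that survived gives back exactly the missing block, the array can lose any one disk. With two disks gone there is not enough information left, which is the limitation discussed next.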

RAID 5 has an impact on availability, because the loss of more than one disk at a time will cause a complete loss of data. It’s important to weigh the value of your data and the reliability of the disks before selecting RAID 5. The MTBFs, rebuild times, and availability of spares/replacements are significant factors.
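
To see how MTBF and rebuild time interact, the sketch below estimates the chance that a second disk fails while a RAID 5 set is still rebuilding. It assumes a constant failure rate (an exponential model) and independent disks, which is a common simplification rather than a vendor formula, and the MTBF and rebuild times are hypothetical values chosen for illustration.

```python
import math


def second_failure_probability(disk_count: int, mtbf_hours: float, rebuild_hours: float) -> float:
    """Chance that another disk fails during the rebuild window.

    Assumes each surviving disk fails independently at a constant
    rate of 1 / mtbf_hours (exponential failure model).
    """
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    if mtbf_hours <= 0 or rebuild_hours < 0:
        raise ValueError("mtbf_hours must be positive and rebuild_hours non-negative")
    surviving_disks = disk_count - 1
    return 1.0 - math.exp(-surviving_disks * rebuild_hours / mtbf_hours)


# Hypothetical 8-disk set and 1,000,000-hour MTBF; compare a fast and a slow rebuild.
for hours in (6, 24):
    risk = second_failure_probability(8, 1_000_000, hours)
    print(f"{hours} h rebuild: {risk:.6%}")
```

Under this model the risk grows with both the number of disks in the set and the length of the rebuild, which is why larger, slower disks make RAID 5 less attractive.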

[Figure: RAID 5]

RAID 5 is a very popular choice for SCSI/SAS disks. After a disk failure, the RAID 5 set must finish rebuilding onto a replacement disk before a second disk fails, otherwise the data is lost. SCSI/SAS disks tend to be smaller in capacity and faster, so they rebuild much more quickly. Because SCSI/SAS disks also tend to be more expensive than their SATA counterparts, it’s important to get a good level of capacity return from them.

With SAN arrays, it’s common practice to allocate one or more spare disks, commonly termed “hot spares”. These spare disks are used in the event of a failure and are immediately moved in as replacements when needed. An advantage from a capacity perspective is that one spare can provide additional redundancy to multiple RAID sets.
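
The capacity benefit of sharing a hot spare can be worked out with simple arithmetic. The sketch below compares one shared spare against one spare per RAID set for three RAID 5 sets. The set sizes, the 600 GB disk size, and the function names are all hypothetical, and the numbers are raw figures that ignore formatting overhead.

```python
def raid5_sets_usable_gb(set_sizes, disk_size_gb):
    """Usable capacity of several RAID 5 sets: each set gives up one disk to parity."""
    if any(size < 3 for size in set_sizes):
        raise ValueError("each RAID 5 set needs at least 3 disks")
    return sum((size - 1) * disk_size_gb for size in set_sizes)


def usable_fraction(set_sizes, hot_spares, disk_size_gb):
    """Share of all purchased disk space (spares included) that holds data."""
    total_disks = sum(set_sizes) + hot_spares
    usable_gb = raid5_sets_usable_gb(set_sizes, disk_size_gb)
    return usable_gb / (total_disks * disk_size_gb)


# Hypothetical layout: three 5-disk RAID 5 sets built from 600 GB disks.
sets = [5, 5, 5]
print("one shared spare:  ", usable_fraction(sets, 1, 600))
print("one spare per set: ", usable_fraction(sets, 3, 600))
```

The usable capacity of the RAID sets is the same in both cases. What changes is how many disks sit idle, so a shared spare gives a better return on the disks you buy.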

RAID 6

RAID 6 is similar in nature to RAID 5, in that the parity data is distributed across all member disks, but it uses the equivalent of two disks for parity. This means it loses some capacity compared to RAID 5 but can withstand two disks failing in quick succession. This is particularly useful when you’re creating larger RAID groups.

From an availability point of view RAID 6 is the best choice, but you pay for it on the capacity side, because space equivalent to two disks is consumed by parity.

If you are planning to use expensive SCSI or SAS disks, stop and think for a moment: are you ready to give up costly disk space to RAID 6, or is availability of the utmost importance to you?
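
The capacity trade-offs described for each RAID level can be put side by side with a short calculation. The sketch below uses the overheads covered in this post: none for RAID 0, half the disks for RAID 10, one disk for RAID 5, and two disks for RAID 6. The eight 600 GB disks are a hypothetical example, and the results are raw figures that ignore hot spares and formatting overhead.

```python
def usable_disks(raid_level: str, disk_count: int) -> int:
    """Number of disks' worth of space left for data in a single RAID set."""
    if raid_level == "RAID 0":
        minimum, usable = 2, disk_count
    elif raid_level == "RAID 10":
        if disk_count % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        minimum, usable = 4, disk_count // 2
    elif raid_level == "RAID 5":
        minimum, usable = 3, disk_count - 1
    elif raid_level == "RAID 6":
        minimum, usable = 4, disk_count - 2
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if disk_count < minimum:
        raise ValueError(f"{raid_level} needs at least {minimum} disks")
    return usable


def usable_capacity_gb(raid_level: str, disk_count: int, disk_size_gb: int) -> int:
    """Raw usable capacity of the set in GB."""
    return usable_disks(raid_level, disk_count) * disk_size_gb


# Hypothetical shelf of eight 600 GB disks.
for level in ("RAID 0", "RAID 10", "RAID 5", "RAID 6"):
    print(level, usable_capacity_gb(level, 8, 600), "GB")
```

The fixed one-disk or two-disk parity cost of RAID 5 and RAID 6 matters less as the set gets larger, while RAID 10 gives up half of the raw capacity regardless of the number of disks.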

[Figure: RAID 6]

Note: The RAID level also dictates how many IOPS you will get out of your disks. We will discuss the effect of the RAID level on performance in part 4 of this series, which covers designing storage for performance.
