Hosting Theory

Evaluating Disk Storage
Part 1

Do you run SATA or SAS Drives in your servers?

We hear this question all the time from inquisitive consumers who want to know more than simply how many gigs of disk space they will have access to with a particular server package. But some customers who ask this question are under the false assumption that simply knowing the type of drive tells you whether a drive is good or bad. This ignores some important considerations when evaluating disk storage.

When considering disk space and configurations, it is important to focus on both what kind of drive is used and what kind of an array it is a part of. In this post we’ll focus on the kinds of drives available in today’s market.

Drive Characteristics

When evaluating individual hard drives, there are four major factors to consider:

  1. Capacity
  2. Speed
  3. Reliability
  4. Power Efficiency

Capacity

Capacity is easy: it’s the amount of space available on the drive. A modern drive is composed of one or more platters which spin at a relatively high speed. Most manufacturers hide part of the space on the drive and use that space to “remap/replace” blocks which go bad or become unreliable. This allows the drive to look “perfect” to the computer until the drive runs out of remapping space, at which point things go rapidly downhill.

Modern drives use a system called “SMART” (Self-Monitoring, Analysis, and Reporting Technology) to let the computer check the health of the drives on a real-time basis. If your computer operating system has SMART enabled, it should be able to warn you when the drive starts to run out of spare blocks, but before you start seeing uncorrectable errors.
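To make the spare-block idea concrete, here is a minimal sketch of the kind of check a monitoring script might run. It assumes you already have two numbers for a drive: how many blocks have been remapped and how many spare blocks it shipped with. Real drives usually expose this through vendor-specific SMART attributes (for example the reallocated sector count) rather than a simple pair of totals, so the function name, its parameters, and the 50% warning threshold are all illustrative assumptions.

```python
def spare_block_status(reallocated, spare_total, warn_fraction=0.5):
    """Classify drive health from how much remapping space is used up.

    All names and the default threshold are hypothetical; real SMART
    data is vendor-specific and usually reported as normalized values.
    """
    if spare_total <= 0:
        raise ValueError("spare_total must be positive")
    used = reallocated / spare_total
    if used >= 1.0:
        # No remapping space left: new bad blocks become visible errors.
        return "failing"
    if used >= warn_fraction:
        # Still looks "perfect" to the computer, but time to plan a swap.
        return "warning"
    return "ok"


print(spare_block_status(12, 1000))    # few remaps so far
print(spare_block_status(600, 1000))   # more than half the spares used
print(spare_block_status(1000, 1000))  # remapping space exhausted
```

The point of the middle state is the one made above: you want the warning while the drive can still hide its bad blocks, not after uncorrectable errors start showing up.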

Speed

Input/Output (or I/O) speed is where things get a bit trickier. You have to take a couple of different factors into account to analyze speed. A drive has two major transaction modes, reading and writing, and two access patterns, random and sequential. A drive that is very fast for sequential reads/writes can be very slow for random reads/writes. As you may have guessed, random read/write speed matters most for applications where data is continually being accessed and changed in many different files, whereas sequential read/write speed matters most for things like backup storage, where large amounts of data are written or read in order.

Also, a drive will typically have some cache associated with it, which is memory used on the drive to speed up transactions so they don’t have to wait for the platter rotation before the computer can move on. This makes the drive faster for “smaller” amounts of data, but it slows down again if the amount of data gets too large and exceeds the cache size.

Reliability

It’s a truism that every drive will fail. So, you want a drive that will give you as much time as possible without any errors, and as much time as possible to replace it once errors arise. The trade-off tends to be reliability versus capacity, where some capacity is given up to provide more reliable operation (more remapping sectors, lower bit density, etc.).

Power Efficiency

Power efficiency is critical in a datacenter environment and is a huge cost of doing business for a hosting company. A faster drive typically uses more power than a slower drive. A bigger drive typically uses more power than a smaller drive. And don’t think that power efficiency is not something to be concerned about as a hosting customer. Even if the environment does not rank high on your list of business concerns, an inefficient drive that takes more power to run costs more to run. And that cost will inevitably be passed on to the consumer.

Drive Types

Now that we’ve looked at what considerations are involved in disk storage, let’s look at the types of drives available. There are four modern drive technologies:

  1. SATA (Serial Advanced Technology Attachment)
  2. SAS (Serial Attached SCSI)
  3. Near-Line SAS
  4. SSD (Solid State Disk)

SATA

SATA gives you the most space for the money, so it wins on “$/GB”, which also makes it power efficient per gigabyte stored. It tends to be slow, though, giving the lowest I/O performance and a poor cost per I/O operation ($/IOP).

SAS

SAS is the evolution of SCSI, which became the technology for enterprise class storage, emphasizing speed and reliability over capacity. SAS drives have a “bit-error-rate” that is 10 times better than a SATA drive (1 in 10^16 for SAS vs. 1 in 10^15 for SATA).
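To get a feel for what a tenfold difference in bit error rate means, the sketch below estimates the chance of hitting at least one unrecoverable bit error while reading a given amount of data. It assumes errors are independent and uses rates of 1 error per 10^15 bits for SATA and 1 per 10^16 bits for SAS. The 2 TB read size is an arbitrary illustration, and the function name is made up for this example.

```python
import math


def p_unrecoverable_error(bytes_read, bits_per_error):
    """Chance of at least one unrecoverable bit error in bytes_read bytes.

    Assumes independent errors at a rate of 1 per bits_per_error bits.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)**bits, computed with log1p/expm1 to stay accurate
    # when p is tiny.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


TB = 10**12  # decimal terabyte, as drive vendors count it

sata = p_unrecoverable_error(2 * TB, 1e15)
sas = p_unrecoverable_error(2 * TB, 1e16)

print(f"SATA: {sata:.2%} chance of an error reading 2 TB end to end")
print(f"SAS:  {sas:.2%} chance of an error reading 2 TB end to end")
```

Under these assumptions the SATA figure comes out somewhere above 1% and the SAS figure roughly a tenth of that. Small as those numbers look, they add up quickly when every sector of a large drive has to be read back without a single error.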

SAS drives are held to a higher reliability standard, in part by reducing the available drive space to increase reliability. The largest SAS drive is typically 1/4 the size of the largest SATA drive. The mean-time-between-failures for SAS is about 25% longer than for SATA drives.

The raw disk IOPS peak about 3x higher for SAS than for SATA, and seek times (how long it takes a read head to reach the desired data on a platter) for SAS are about one-half to one-third those of SATA. This makes SAS drives typically “faster” than comparable SATA drives.
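The link between seek time and IOPS can be sketched with a common rule of thumb: a random I/O costs roughly one average seek plus half a platter rotation. The seek times and spindle speeds below are illustrative assumptions, not measurements of any particular drive, and the model ignores cache, queuing, and transfer time.

```python
def estimated_iops(avg_seek_ms, rpm):
    """Rough random IOPS for a spinning drive.

    Rule of thumb: one random I/O takes about one average seek plus
    half a revolution of rotational latency.
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


# Hypothetical figures chosen only to illustrate the calculation.
sata_iops = estimated_iops(avg_seek_ms=9.0, rpm=7200)
sas_iops = estimated_iops(avg_seek_ms=3.5, rpm=15000)

print(f"7,200 rpm drive with 9 ms seeks:    about {sata_iops:.0f} IOPS")
print(f"15,000 rpm drive with 3.5 ms seeks: about {sas_iops:.0f} IOPS")
```

Even this crude model shows why shorter seeks and faster spindles translate directly into more random operations per second.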

SAS gives you the best $/IOP and the lowest number of drive failures.
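The two cost metrics used here, $/GB and $/IOP, are simple ratios. The sketch below computes both for two made-up drives. The prices, capacities, and IOPS figures are hypothetical placeholders picked only to show how a drive can win one metric and lose the other, so substitute real quotes and benchmark numbers before drawing conclusions.

```python
def cost_metrics(price_usd, capacity_gb, iops):
    """Return (dollars per GB, dollars per IOP) for one drive."""
    return price_usd / capacity_gb, price_usd / iops


# Hypothetical drives; none of these numbers come from a real price list.
sata_gb, sata_iop = cost_metrics(price_usd=100.0, capacity_gb=2000.0, iops=80.0)
sas_gb, sas_iop = cost_metrics(price_usd=200.0, capacity_gb=500.0, iops=180.0)

print(f"SATA: ${sata_gb:.2f}/GB, ${sata_iop:.2f}/IOP")
print(f"SAS:  ${sas_gb:.2f}/GB, ${sas_iop:.2f}/IOP")
```

With these placeholder inputs the larger, slower drive is cheaper per gigabyte and the smaller, faster drive is cheaper per I/O operation, which is the trade-off described above.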

Near-Line SAS

A related technology, near-line SAS, uses SATA drive hardware combined with a SAS interface, which gives SATA drive sizes, with the speed and reliability of a SAS interface (up to 10-30% faster than normal SATA, with the appropriate controller). This gives you better $/GB, but with a higher risk of drive failures.

SSD

SSD, the same flash technology found in iPods, iPads, and USB flash drives, has limited drive space, so it does not win $/GB, but it has very high performance and does well on $/IOP. Unfortunately, due to the limited number of write cycles on the flash memory, the reliability is not where it needs to be for enterprise use in environments with a large number of write operations.

Flash memory currently tolerates multiple orders of magnitude fewer writes per sector than a magnetic hard drive. In consumer SSD devices, a method called wear leveling spreads the writes around the whole unused area of the drive, thus keeping any individual sector from getting too many writes. In an enterprise environment, there are far fewer unused sectors, so wear leveling is less effective. However, this is constantly improving, so SSD is on the way to becoming viable.
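A toy model shows why a fuller drive wears faster under the leveling scheme described above. It assumes writes are rotated evenly across the unused blocks only, which is a deliberate simplification: real controllers can also relocate rarely changed data to even out wear. The block counts and write totals are arbitrary illustrations.

```python
def max_wear(total_blocks, used_blocks, writes):
    """Highest per-block write count when writes rotate over free blocks.

    Simplified model: only unused blocks take part in leveling.
    """
    free = total_blocks - used_blocks
    if free <= 0:
        raise ValueError("no free blocks left to level writes across")
    wear = [0] * free
    for i in range(writes):
        wear[i % free] += 1  # round-robin over the free blocks
    return max(wear)


# Same drive size and same write load, different amounts of free space.
mostly_empty = max_wear(total_blocks=1000, used_blocks=200, writes=100_000)
nearly_full = max_wear(total_blocks=1000, used_blocks=900, writes=100_000)

print(f"20% full: busiest block written {mostly_empty} times")
print(f"90% full: busiest block written {nearly_full} times")
```

In this model the nearly full drive puts eight times as many writes on each free block, which is the effect that makes heavily used enterprise volumes harder on flash.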

So far, so good. But before the full story can be told, we must consider the technology behind the disk array and its configuration. RAID arrays and controllers allow for much higher capacity, performance, and reliability than individual drives. And some of the surprising benefits and limits of each type of drive in a RAID array dictate the optimal choice in any given application. So check in next time for part two of Evaluating Disk Storage.

Photo by daniel spils
