RAID, Backups and Beyond: The Basics of Data Protection

When you invest in the monthly expense of a hosting service, one of the biggest questions in your mind—other than, will my customers be able to get to my site?—is: How protected is my data?

Managing thousands of servers here at ServInt, we’ve seen it all: from the common problem of a VPS customer who accidentally deletes an important file to a head crash on a host machine physically scoring the platters of a hard drive.

Human error and hardware failures are simply part of doing business as a hosting company. Even the most competent webmaster or programmer can inadvertently overwrite a file. And no piece of hardware can be guaranteed error-free 100% of the time. So how are customers supposed to protect themselves from the inevitable need to replace lost or corrupted data?

There are three important—and distinct—data protection options to consider when researching hosting solutions. ServInt recommends customers utilize all three options whenever possible.

1. RAID. RAID stands for Redundant Array of Independent (or Inexpensive) Disks. Simply put, RAID is a way of spreading data over multiple disks so that, in most configurations, no data is lost if one or more disks in the array fail. The simplest redundant form is RAID 1, which mirrors one half of the array onto the other. Many other configurations exist, including RAID 0 (striping only, which improves speed but provides no redundancy), RAID 2, 3, 4, 5 and 6, and nested combinations such as RAID 10.

Each RAID configuration takes a slightly different approach to distributing data across an array, using striping, mirroring, and dedicated or distributed parity (a form of checksum used to “rebuild” lost data). Different configurations have benefits and drawbacks regarding efficient use of disk space, write speed, recovery characteristics, etc.
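To make the idea of parity concrete, here is a minimal sketch in Python of how a lost block can be rebuilt from the surviving blocks. It assumes simple XOR parity of the kind used by single-parity levels such as RAID 5; the block contents and the helper name xor_blocks are made up for illustration, and a real controller works on much larger stripes in hardware or in the kernel.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per disk, plus a parity block
# that would live on a fourth disk.
data = [b"web ", b"site", b"data"]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the surviving data blocks
# with the parity block to recover the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # b'site'
```

The same arithmetic shows the limit of the scheme: with a single parity block, only one missing block per stripe can be recovered.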

This layer of protection, when available, allows the seamless rebuilding of degraded disk arrays and prevents much of the data loss that would otherwise occur when a disk fails. But while RAID is an important layer of data protection, it is not a complete backup solution. RAID governs how data is written to the drives. If one disk (or more, depending on the size and type of the array) fails, the data remains accessible on the other disks while the failed drive is replaced. But if a file is corrupted or manually deleted, that change is written across the array too, and the fanciest RAID array in the world won’t bring the file back to life. That’s where backups come in.

2. Backups. All data on a server should be backed up daily. Any reputable hosting company will provide a competent backup solution to its customers. Many include off-server solutions that combine the backup needs of many host machines onto large storage servers.

Why off-server? How protected is your data if the backups are stored on the same hardware as the main data? To be fair, in-server second-drive backup solutions are fine in many cases, but they require a technician to remove the drive from the chassis when restoring to a new piece of hardware, or to replace the primary drive before data can be recovered from the secondary drive. Additionally, on-server backups do not allow for central management of backup data, and they provide limited backup capacity compared to networked backup server solutions.
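Because backup capacity is finite, daily backups are usually rotated so that older copies make room for newer ones. The following is a rough sketch of that idea in Python, not a description of any particular provider's system. The file naming scheme (backup-YYYY-MM-DD.tar.gz), the seven-day retention window, and the function names are all assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical naming scheme: backup-YYYY-MM-DD.tar.gz
PREFIX = "backup-"


def backup_date(name):
    """Extract the date stamp that follows the prefix in a backup file name."""
    stamp = name[len(PREFIX):len(PREFIX) + 10]
    return date.fromisoformat(stamp)


def backups_to_prune(names, keep=7):
    """Return the backups that fall outside the `keep` most recent ones."""
    newest_first = sorted(names, key=backup_date, reverse=True)
    return newest_first[keep:]


# Ten daily backups; with a seven-day window the three oldest are pruned.
names = [f"backup-2024-03-{day:02d}.tar.gz" for day in range(1, 11)]
print(backups_to_prune(names, keep=7))
# ['backup-2024-03-03.tar.gz', 'backup-2024-03-02.tar.gz', 'backup-2024-03-01.tar.gz']
```

The function only decides which files are candidates for deletion; actually removing them is left to the caller so the policy can be reviewed first.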

3. Off-Network/Off-Site Backups. If the only concern in our lives were the reliability of host machine hardware, hosting customers would not need to worry about off-network and off-site backups. But this is just not the case. Redundancies are layered into systems precisely because multiple failures can happen at once, no matter how robust a hosting company’s data protection measures are.

Every responsible customer should initiate a program to back up his or her data on machines that are completely independent of the host machine’s hardware and network. For most small customers this might be as simple as archiving data through the control panel and downloading it to a home computer. For larger customers, this may mean seeking out a second solution from their hosting provider. Does your provider offer servers on multiple VLANs in independent data centers? If not, it may pay to investigate a second hosting company for your secondary backup solution, or to switch hosts completely.
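For the simple archive-and-download approach, it helps to confirm that the copy that arrives at home is identical to the one created on the server. Below is a minimal sketch in Python, using only the standard library, that packs a directory into a compressed archive and computes a SHA-256 checksum to compare after the download. The directory and file names are placeholders, and the function names are invented for this example.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_with_checksum(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar file and return its checksum."""
    source = Path(source_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(source), arcname=source.name)
    return sha256_of(archive_path)


if __name__ == "__main__":
    # Demo in a throwaway directory; a real run would point at the site's files.
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp) / "site"
        site.mkdir()
        (site / "index.html").write_text("<h1>hello</h1>")
        archive = Path(tmp) / "site-backup.tar.gz"
        checksum = archive_with_checksum(site, archive)
        # After downloading, recompute the checksum on the local copy and compare.
        print("archive checksum matches:", sha256_of(archive) == checksum)
```

Recording the checksum alongside the archive gives a quick way to detect a truncated or corrupted download before the backup is ever needed.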

Customers’ needs for off-network/off-site backups are so varied that it is difficult for hosting companies to provide out-of-the-box solutions. For this reason, it is often left up to the customer to be proactive regarding this last line of defense against data loss.

These three levels of data protection just scratch the surface of what’s available to the customer serious about protecting data. But they are a baseline. If you can confidently say the data protection solution for your online content includes these three levels of protection, you are well on your way to a secure deployment. And you are ahead of much of the competition as well!

