
Online Backups

By Joseph Kexel - Posted on 13 June 2009

A client recently asked about using an online backup service to cover their entire backup requirement. An off-site backup that runs automatically is a great thing to have. However, the technology has certain limitations.

I will not name any specific service here, but I will try to explain how such services work and why a good amount of planning should go into using them.

Virtually all of these services offer encryption. They will boast that "this" or "that" algorithm makes them the best service available. AES is the algorithm the U.S. federal government selected as its standard after a public search. It is a fine choice for your data.

Other acceptable algorithms are Twofish and Blowfish. Their advantage, for those who distrust the government, is that the government has not made either one its cipher of choice, so neither is being pushed down to everyone. AES, though, appears to be just as strong as these two.

Okay, you have found a service which uses AES, Twofish or Blowfish, but you still have some thinking to do. All encryption uses keys, similar to the keys to your home or car. Some providers leave the key in your control, meaning the password, passphrase or random text you provide is the key to your data. Obviously, in that case they handle your key while your data is being accessed. They may not store it, but their servers know it while you are using it. The other option is a service that holds the key itself and gives you an account through which to use it, a key they have complete control over. That is like having your neighbor open your house for you each time you come home. They know you are the owner, so they use the only key available to open your home. How much do you trust your neighbor? Each scenario has issues.

The first method may seem secure, since you must provide the REAL key each time, but you must trust that their software is not misusing it, whether for nefarious purposes or simply by failing to handle it safely. The second scenario is even more troubling, because administrators at such a firm can reach your key by accessing your account. You are then at the mercy of whatever controls they have put in place to secure your online backup. What are they? Your guess is as good as mine, for most sites do not offer much information on that.

To make it simple: if your data absolutely must be kept secret, you must encrypt it on your own systems first. Only the encrypted data should be sent to the service.
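Encrypting on your own systems starts with turning your passphrase into a key locally, so the service never sees either one. Here is a minimal sketch of that step in Python, assuming PBKDF2-HMAC-SHA256 from the standard library. The function name, the iteration count and the sample passphrase are illustrative assumptions, not anything a particular backup tool prescribes. The actual encryption would still be done by an AES-capable tool or library.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key (the size AES-256 expects) from a passphrase.

    The iteration count is an assumed value for this sketch; higher is slower
    for you and for anyone trying to guess the passphrase.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


if __name__ == "__main__":
    # The salt is not secret, but it must be stored with the backup so the
    # same key can be derived again at restore time.
    salt = os.urandom(16)
    key = derive_key("a long hypothetical passphrase", salt)
    print(len(key))  # 32
```

The point of the design is that the passphrase and the derived key stay on your machine. Only the salt and the encrypted data need to travel to the provider.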

That sounds simple enough, but it brings up the next issue: "How much can you send over the upstream side of your ISP connection?"

The reason this quickly becomes an issue is that many online backup providers use differential backup techniques, much like the open source rsync. The method is simple: instead of uploading the entire data set each day, you upload only the changes. This benefits both you and the backup service, because once the bulk of your data has been transferred, only new or changed data needs to be uploaded. This vastly increases how much you can keep backed up online. When you encrypt your data before transmission, which I recommend, differential backups break down. Even a single-bit difference can make an encrypted file completely different and require a full upload to the backup service.
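To make the effect concrete, here is a rough Python sketch. It compares two versions of a file block by block, which is a simplification of what rsync-style tools do (real tools use rolling checksums so they can also cope with inserted data). The toy_encrypt function is a made-up stand-in for a real cipher used with a fresh random nonce. It is NOT secure and is not AES. It exists only to show why ciphertext defeats change detection.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block so two versions can be compared cheaply."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of blocks in `new` that are added or differ from `old`."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or digest != old_hashes[i]
    ]


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """TOY cipher for illustration only, NOT secure.

    XORs the data with a SHA-256 based keystream under a fresh random nonce,
    and prepends the nonce to the output.
    """
    nonce = os.urandom(16)
    out = bytearray(nonce)
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


if __name__ == "__main__":
    yesterday = bytes(10 * BLOCK_SIZE)   # ten blocks of zeros
    changed = bytearray(yesterday)
    changed[5000] ^= 1                   # flip a single bit inside block 1
    today = bytes(changed)
    key = b"hypothetical key"

    # Plain files: only the block containing the flipped bit must be sent.
    print(changed_blocks(yesterday, today))  # [1]
    # Encrypted files: every block differs, so everything must be sent again.
    print(len(changed_blocks(toy_encrypt(yesterday, key), toy_encrypt(today, key))))
```

On the plain files, one block out of ten needs uploading. After encryption, every block looks different, so the whole file goes up again.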

Clearly, you cannot back up more each day than your upstream speed can support. A terabyte of data, for example, would take more than 2,300 hours, over three months of continuous uploading, to send over a 1 megabit upstream connection. Remember that nearly all DSL and cable connections are asymmetric, meaning your downstream speed is greater than your upstream speed.
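The arithmetic is easy to check for your own numbers. The small Python sketch below assumes decimal units (1 TB = 10^12 bytes, 1 megabit = 10^6 bits) and a perfectly saturated link, which gives about 2,222 hours. Counting in binary units (2^40 bytes over 2^20 bits per second) gives about 2,330 hours, the figure quoted above. The function name is just an illustrative choice.

```python
def upload_hours(size_bytes: float, upstream_bits_per_sec: float) -> float:
    """Hours needed to upload `size_bytes` at a constant upstream rate."""
    seconds = size_bytes * 8 / upstream_bits_per_sec
    return seconds / 3600


TERABYTE = 10**12  # decimal terabyte, an assumption of this sketch

if __name__ == "__main__":
    print(round(upload_hours(TERABYTE, 1_000_000)))  # 2222
    print(round(upload_hours(TERABYTE, 384_000)))    # 5787
```

Either way you count, a terabyte over a 1 megabit link is a matter of months, and a 384 kbit DSL line stretches that to the better part of a year.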

So, there is a real limit to how much you can upload each day. Assume 1 megabit upstream, which is realistic only for cable; most DSL lines top out around 384 kbit. The most data you can upload during a 12-hour non-work window is then about 5.6 GB. That assumes you can keep the 1 megabit connection saturated with no network delays. A DSL connection has a limit of about 2 GB. For some that will not be an issue, but for others the data will never fit within that limit.
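Here is a short Python sketch of the nightly budget, again assuming decimal units and a link that stays saturated for the whole window. With decimal units the cable figure comes out at 5.4 GB, slightly under the figure above, which corresponds to counting a megabit as 2^20 bits. The function names and the 50 GB example are illustrative assumptions.

```python
import math


def nightly_capacity_bytes(upstream_bits_per_sec: float, window_hours: float = 12) -> float:
    """Bytes that fit through the upstream link during one backup window."""
    return upstream_bits_per_sec * window_hours * 3600 / 8


def nights_needed(size_bytes: float, upstream_bits_per_sec: float,
                  window_hours: float = 12) -> int:
    """Whole backup windows needed to upload `size_bytes`."""
    capacity = nightly_capacity_bytes(upstream_bits_per_sec, window_hours)
    return math.ceil(size_bytes / capacity)


if __name__ == "__main__":
    print(nightly_capacity_bytes(1_000_000) / 1e9)  # 5.4    (GB per night, cable)
    print(nightly_capacity_bytes(384_000) / 1e9)    # 2.0736 (GB per night, DSL)
    print(nights_needed(50e9, 1_000_000))           # 10 nights for a 50 GB initial upload
```

If the amount of data that changes each day is larger than the nightly capacity, the backup can never catch up, no matter how the window is scheduled.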

Note that we are talking about data here, not the OS of your server. Fortunately, a server OS does not change much, but there can still be proprietary information hidden in the registry and other configuration files. With an OS easily taking up over 10 GB, you will have to choose NOT to encrypt the OS in your online backup, or you will never have the time to upload your data.

The solution is very simple. Keep your local backups, and use the online backup for data only. There is a glimmer of hope for your data, too. Programs like 7-Zip or WinZip can compress your data as well as encrypt it. Some data, such as databases, whether file based or exported from modern database engines, compresses very well, and uploads of the encrypted result will work great. Another benefit of an online backup service is that once off-site backup is taken care of, tape backups (which should be rotated off-site) may give way to external USB drive backups, which are faster and do not require human intervention such as changing tapes.
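To get a feel for how much compression can help, the Python sketch below uses the standard zlib module on a made-up, repetitive CSV-style export and compares it with random bytes, which behave much like encrypted data. The sample rows are invented for illustration, and real databases will vary. The comparison also shows why compression has to happen before encryption: once data looks random, there is nothing left to squeeze.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)


def sample_export(rows: int = 20_000) -> bytes:
    """Build a hypothetical, repetitive database export for the comparison."""
    lines = (f"{i},customer{i % 50},2009-06-13,OPEN\n" for i in range(rows))
    return "".join(lines).encode("ascii")


if __name__ == "__main__":
    export = sample_export()
    random_like = os.urandom(len(export))  # stands in for already-encrypted data

    print(f"export:      {compression_ratio(export):.2f}")
    print(f"random data: {compression_ratio(random_like):.2f}")
```

The export shrinks to a fraction of its size, while the random data does not shrink at all. A smaller archive means a shorter upload each night.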

Online backup services are a very useful tool in your disaster recovery arsenal, but you must decide what needs to be encrypted before transmission, how much you can realistically upload each night and what your full server disaster recovery options should be. Evaluate carefully what you need to back up online and how you intend to do so.

So, in conclusion, I recommend local backups of everything, to allow the fastest restoration of deleted files and recovery from hard disk failure. Restoring everything, including all server settings, saves a lot of work and money. Online backup services are for the worst-case scenario: your facility is destroyed by fire or natural disaster and you are rebuilding on fresh servers. In that situation, server OS recovery will not be needed; only the true data needs to be restored. There are other issues as well, such as the service provider going bankrupt. So online backup will never be the only solution you need. A broad, overlapping solution will protect you best.