Laying the groundwork for disaster recovery
So you think you want to create a disaster plan for your company? I don’t know about you, but the World Trade Center attacks on September 11th sure made the need for such a plan obvious. I’d venture to guess that those companies in the area which had good plans are still in business, and those that didn’t are either defunct or on their way down. A disaster plan is that important – it’s that critical. In fact, a good disaster plan can mean the difference between the survival of a company and its quick and painful death.
Taking A Look
Now what do you do? The first step is to determine where you are as far as disaster recovery goes. Take a walk around your computer room and just look at things. Examine your computers, your facilities and your offices. What do you see?
You may see some computers, of course. Look closely at them. Are these computers being backed up regularly? Is RAID installed and operating properly? Do you monitor them on a regular basis? Is their power protected by a UPS and/or a generator? How safe is the wiring? Do you have any redundancy in your computers, disks, network equipment and so on?
Also look at the computer room and related facilities. Is your computer room clean and well air-conditioned? How well would the room survive in the event of a fire? Is the room reinforced?
How about the power and communications? Are they well protected and redundant? How fast is your network? Are you using CAT5 or something else?
Look very hard at your backup strategy without making any assumptions about its value. Is it really working? Would you bet your job on the strength of your backups? Regardless of your answer, grab one of your backups and restore some data – see what you get.
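One way to make that restore test concrete is to compare checksums of the restored files against the originals. The following is only a minimal sketch in Python, not a substitute for whatever verification your backup software offers. The function names and the two directory arguments are hypothetical, and it assumes you have restored the backup into a scratch directory and that the originals have not changed since the backup was taken (a file edited after the backup will show up as a mismatch even though the backup is fine).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(original_dir: Path, restored_dir: Path) -> dict:
    """Compare every file under original_dir with its restored copy.

    Both directory arguments are hypothetical placeholders: point them
    at the live data and at the scratch area you restored into.
    """
    report = {"ok": [], "mismatched": [], "missing": []}
    for original in sorted(original_dir.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            # The file exists in the original but never came back.
            report["missing"].append(relative.as_posix())
        elif sha256_of(original) != sha256_of(restored):
            # The file came back, but its contents differ.
            report["mismatched"].append(relative.as_posix())
        else:
            report["ok"].append(relative.as_posix())
    return report
```

Anything listed under "missing" or "mismatched" deserves a closer look before you trust that backup.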
Now look around the office. Are there papers and floppy disks and other essential materials all over the place? How well would the company survive without this information? What about paper files, phone and address books, even daily planners? Are these needed for daily operations?
Look At The Environment
Don’t forget to look at the environment where your company is located. A good place to start is with the exterior of your building itself. Put on your work clothes and crawl around, looking at everything.
Is your building new or old? Does it meet modern construction codes or is it substandard in some way? Is your office in a skyscraper or at ground level? How much infrastructure is on top of your building?
Many years ago I was a consultant who was called in to help with a major disaster at the Public Broadcasting Service (PBS). Their computers were located in the basement of their multi-floor building in Washington DC (I think it was 17 floors). The basement was well protected and seemed very safe, even if the entire building collapsed.
Unfortunately, a fire had broken out on the top floor of the building. Naturally the fire department put out the fire with water. Guess where all of that water went? Right. Into the basement. The poor computers were totally soaked, and no one even had a chance to turn them off before the water hit them. Fortunately, PBS was smart enough to have hired a company to maintain a spare site and thus they had their computers back on the air within a couple of days. I was called in to help evaluate whether or not the old, waterlogged computers could be salvaged.
Okay, back to the environment. Expand your horizons and look around the neighborhood. Are the buildings close together or far apart? How good are the transportation systems?
Is your office close to major medical, fire and other emergency services? Is your local fire department well-trained and adequately staffed? Does your city even have a disaster plan of any kind?
For that last question, you may be surprised to find out that the entire disaster plan is simply “we’ll handle it” or “that’s someone else’s problem” or something to that effect. Don’t worry about any of this – your job is just to gather the facts. You will deal with them later.
Evaluate The Risks
As you are looking, think about the risks. Where are you at risk?
In California, we have problems with the generation of electrical power. Thus, one of our major, almost weekly, risks is that of short-duration (one hour or less) blackouts and brownouts. Computer equipment depends on power, so that’s an obvious area that needs to be protected. Remember that you need to protect not only against lack of power – you also need to be sure your systems are not damaged by surges and spikes.
For our company, we actually paid for an analysis by a geology student to determine how much we were at risk for an earthquake. Our company is based near Los Angeles, which seems to imply that we will have earthquakes – but how big could they be? How long will they last? Our study went so far as to include maps of the various faults in the area. We needed this data to determine where to place our disaster site.
What about weather? A few years ago we determined we had extreme risks from the El Niño effect which, according to the meteorologists, would produce some unusually strong and unpredictable weather. We were thus forced to think about the weather, even though here in California we normally don’t have to plan on major weather damage.
Don’t forget the small stuff – that’s actually what will cause the greatest headaches later on. Look around and see if you have the possibility of damage from vermin such as rats and insects. I remember that as a consultant I once had to troubleshoot why a computer system kept crashing. It turned out the wires were being chewed upon by a family of mice that loved the warm conditions inside the cabinets.
Figure Out What You Want To Do
Don’t worry about what you think you can do at this point. Instead, put together a report, intended just for yourself, of what your ideal situation would be. Be brutal and write it up as if you had no monetary limitations whatsoever.
Start small and work your way up. You’d probably want to ensure that all of your systems had excellent backups, for example, and perhaps RAID disks protecting the data. You might want a 100 megabit network with CAT5 cabling throughout the building. Don’t forget about off-site storage of your backup tapes.
Work bigger now. Does the computer room need corrections? Perhaps you would like to install a big UPS or standalone generator to protect against power interruptions. If your office is communicating across a WAN, you might want to put in redundant trunk lines, say a dialup ISDN to use as a backup here and there.
Now work even higher up. What happens if the entire building is destroyed? You’d probably like to have a spare site ready to go. Do you want a fully operational hot site? If so, put it into your list.
Don’t forget the fact that computers are useless unless people can use them. Would you put workstations in the hot site? What about the users’ data? How does that get there?
Finally, be sure to include the larger picture in your outline. If your city is not prepared for disasters, then what can you do? Is there an opportunity to push the city into better disaster planning or does your company need to go the whole mile on its own?
Start Putting It Together
Unless you have a unique situation, you most likely will not get funding to do everything that you want at once. In fact, you may find your boss or other managers to be downright hostile towards the idea of disaster recovery. After all, there is no direct, immediate, tangible benefit to all of this.
In my experience, the most likely outcome is simple indifference. No one will want to fund anything and you will have to work your butt off just to get people to talk about it at all.
How do you handle this? It’s not that hard really.
The first thing to understand is there are lots of things you can probably do, assuming you have some authority, without too much trouble. Just look at your list and find something that you can do. Anything – it’s not important what that thing is. Just pick something.
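If your ideal-situation list has grown long, a few lines of Python can help surface the items you could start on today. This is only a sketch under stated assumptions: the `WishlistItem` fields, the `quick_wins` function, and the sample entries with their costs are all invented for illustration, so substitute your own list and your own spending authority.

```python
from dataclasses import dataclass


@dataclass
class WishlistItem:
    name: str
    estimated_cost: float   # rough guess, in whatever currency you use
    needs_approval: bool    # True if someone above you must sign off


def quick_wins(items, spending_limit):
    """Return items you can act on alone, cheapest first."""
    doable = [
        item for item in items
        if not item.needs_approval and item.estimated_cost <= spending_limit
    ]
    return sorted(doable, key=lambda item: item.estimated_cost)


# Hypothetical sample data; the names and costs are made up.
wishlist = [
    WishlistItem("Fully operational hot site", 250000.0, True),
    WishlistItem("Better tape drive and tapes", 1500.0, False),
    WishlistItem("Written backup procedures", 0.0, False),
    WishlistItem("RAID for the applications server", 3000.0, False),
]

for item in quick_wins(wishlist, spending_limit=2000.0):
    print(item.name)
```

The point is not the tooling. It is simply to separate what needs a budget fight from what you can quietly fix this week.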
You might start by upgrading your backup plans. This is very simple if you’ve done some tests and can prove that they do not currently work or do not work very well. Perhaps all you need to do is buy a better tape drive and some tapes. Congratulations – you are now better able to survive a disaster.
Okay, follow that by coming up with procedures for backup and offsite archival. Put those procedures into practice and test them thoroughly. Now you are even better able to survive a disaster.
Pick another thing from your list. Maybe your main applications server is not protected by RAID. Go ahead and buy a RAID controller and some disks. Install them. You’ve just reduced the possibility of disaster with this server.
What are you doing? You are fixing some of the problem areas. You see, a disaster plan is absolutely useless unless you’ve got your act together operationally. Your backups must be perfect, your systems must operate well, your procedures must be well done.
You are also building up your own confidence and making yourself look good. If you improve your backup procedures, your reputation will go up when you can restore the data lost in a crash. If you install a UPS on the servers, then you will look very good when the power fails.
As it becomes obvious that you are making the correct decisions, and those decisions are proven right, you can move to the next phase – selling the boss on a real disaster plan.