No disaster planning at all – This is the situation at many companies: disaster has not even been considered and no planning has been done. When a disaster occurs, people become frantic and recovery is difficult if not impossible. Companies without a good disaster recovery plan are living on borrowed time. A disaster will strike: it could be a flood, a collapsed ceiling, an infestation of insects in the wiring, a major earthquake or a terrorist attack.
IT departments without any kind of disaster recovery need to get new management. Disasters can occur at any time and rarely give notice, and the lack of a plan can mean the complete destruction of a company. If you work in a company whose IT department does not plan for disasters, then perhaps it would be wise to start looking around for one that does.
No disaster plan, but good backup procedures – If you cannot get anyone in your company to buy into disaster planning, then the bare minimum is to back up the data on your computers regularly (at least once a day) and store the backups offsite (use an archival company; never store them at employees’ homes). If your IT department is not making good backups of at least the critical systems every single day, then it is simply not doing its job. One important thing to remember about backups: they must be tested regularly. Nothing is more frustrating than needing a backup and finding that the data is corrupt or missing.
IT departments that do not perform this simple step should be fired en masse, as they are placing their entire companies at risk.
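One way to make backup testing routine is to record a checksum when each backup is written and compare it again before relying on the copy. The following is a minimal sketch in Python, not a complete verification procedure; the file name `nightly.bak` and the helper names `sha256_of` and `backup_is_intact` are illustrative assumptions rather than part of any particular backup product.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(backup: Path, expected_digest: str) -> bool:
    """True if the backup exists, is non-empty, and matches the recorded digest."""
    if not backup.is_file() or backup.stat().st_size == 0:
        return False
    return sha256_of(backup) == expected_digest


if __name__ == "__main__":
    # Demo with a throwaway file standing in for a real backup archive.
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "nightly.bak"        # hypothetical file name
        archive.write_bytes(b"critical data")
        recorded = sha256_of(archive)              # stored when the backup is made
        print(backup_is_intact(archive, recorded))   # True: copy is unchanged
        archive.write_bytes(b"critical dat\x00")     # simulate corruption
        print(backup_is_intact(archive, recorded))   # False: digest no longer matches
```

A matching checksum only shows that the file has not changed since it was written. It does not prove the backup can actually be restored, so a periodic trial restore onto a spare machine is still worthwhile.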
Fault tolerance – The next step up in disaster recovery is to build fault tolerance into all of your critical systems. This means installing RAID arrays (groups of disk drives that store data redundantly, so that a single drive can fail without losing data), clustered systems and other types of local recovery procedures.
A plan for disaster without any resources in place – Once you have a good backup and archival procedure and your critical systems are fault tolerant, the next step is to put together procedures for remote disaster recovery. This simply means you ask and answer the question, “what do we do if the computer center is utterly destroyed?” You might, for example, make arrangements with another division or company to share equipment and space if either is struck by disaster. Agreements need to be made with critical computer vendors to quickly ship new systems in the event of an emergency. This kind of planning is a good first step, although recovery would be slow in the event of disaster.
Cold Site – This is a site (often managed by a third party and shared among multiple clients) which is stocked with equipment and ready to be activated. However, the machines are not operational, data is not copied on a live basis and time (generally more than 24 hours) is required to bring the site up live. This is a popular disaster recovery method because it tends to be less expensive than other options, yet still gives a company the ability to survive a true disaster.
If you outsource your disaster recovery to a third party, then odds are they will establish this form of disaster recovery. This will work as long as your planning is good, your backups are sound and your documentation is excellent. Of course, extended downtime in the event of a disaster must be acceptable for a cold site to be a valid option. Plan on at least twenty-four hours for critical systems and as long as a week for less important functions.
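Whether that downtime is acceptable can be checked system by system. The sketch below is only an illustration: it uses the planning figures above (roughly twenty-four hours for critical systems, up to a week for the rest) and flags any system whose tolerable outage is shorter. The system names and their tolerances are made-up examples, and `cold_site_gaps` is a hypothetical helper.

```python
from dataclasses import dataclass

# Planning figures taken from the text: about 24 hours to restore critical
# systems at a cold site, and up to a week (168 hours) for everything else.
COLD_SITE_HOURS_CRITICAL = 24
COLD_SITE_HOURS_OTHER = 7 * 24


@dataclass(frozen=True)
class System:
    name: str                    # hypothetical system name
    critical: bool               # restored first at the cold site?
    max_downtime_hours: float    # outage the business says it can tolerate


def cold_site_gaps(systems: list[System]) -> list[str]:
    """Return names of systems whose tolerable downtime is shorter than
    the cold-site restore estimate for their priority class."""
    gaps = []
    for system in systems:
        estimate = COLD_SITE_HOURS_CRITICAL if system.critical else COLD_SITE_HOURS_OTHER
        if system.max_downtime_hours < estimate:
            gaps.append(system.name)
    return gaps


# Illustrative inventory; every entry here is an assumption.
inventory = [
    System("order-entry", critical=True, max_downtime_hours=4),
    System("payroll", critical=True, max_downtime_hours=72),
    System("intranet-wiki", critical=False, max_downtime_hours=336),
]
print(cold_site_gaps(inventory))  # ['order-entry']
```

Any system that shows up in the result is a candidate for a warmer recovery option than a cold site.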
A split site – Some companies are large enough that the IT department can be staffed at more than one location. In the event of a disaster at one site, operations would simply shift to the other. Any needed equipment could be purchased as necessary in the event of a disaster. The advantage of this method is that it eliminates the major up-front costs of building a disaster center.
A Warm Site – If your company has the resources and good sense to understand that IT is vital to its survival, then you should be able to at least create a warm site. This is a site which is pre-positioned with equipment, software and other necessities, all ready to go in the event of a disaster. The equipment is idle, often turned off, but can be quickly restored and brought online if needed. Data is quickly available and can be restored without much difficulty.
Companies that go to this level of disaster preparedness are rare; a high level of competence and forward thinking is required to plan, build and maintain it.
A Hot Site – In this scenario, a duplicate computer center is set up in a remote location (at least a few miles from the primary computer facility), with communications lines set up and actively copying data at all times. The site has a duplicate of every critical server (at least), with data that is up-to-date to within hours, minutes or even seconds. It also (in the best case) has desks, phones and whatever else is necessary for operations to continue if the worst happens.
This is the ultimate in disaster preparation, reserved for companies with excellent management and highly skilled IT staff. Hot sites are expensive, difficult to set up and require constant maintenance, but in the event of a disaster operations can continue with a minimum of downtime.
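Part of that constant maintenance is confirming that the copied data really is as current as intended. As a rough sketch, assuming the replication tool can report the timestamp of the last change applied at the hot site, the lag can be compared against a target. The function names, the fifteen-minute target and the timestamps below are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def replication_lag(last_applied: datetime, now: datetime) -> timedelta:
    """How far the hot site's copy trails the primary (never negative)."""
    return max(now - last_applied, timedelta(0))


def within_target(last_applied: datetime, now: datetime, max_lag: timedelta) -> bool:
    """True if the hot site is no further behind than the agreed target."""
    return replication_lag(last_applied, now) <= max_lag


# Fixed example timestamps so the result is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_applied = now - timedelta(minutes=5)   # assumed value reported by the replica
target = timedelta(minutes=15)              # assumed target; set per business need

print(replication_lag(last_applied, now))        # 0:05:00
print(within_target(last_applied, now, target))  # True
```

Running a check like this on a schedule, and alerting when it fails, helps catch a stalled replication link before a disaster exposes it.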
The ultimate – Sometimes senior management is very intelligent and understands the criticality of IT and the necessity for the systems to be up quickly in the event of a disaster. With this in mind, you can put together a very well thought out disaster plan. The ideal solution is a hot site which is connected to the main computer facility by a fast communication line. All of the data is copied to the hot site computers over the line in real time or at staggered intervals. In addition, all communications with remote locations (in our case, the stores) have backup capability to the hot site as well.
Conclusions – In the ideal situation, the operating basis is simple: you want to have a job and get paid if your computer center is destroyed. It’s very simple, really. If senior management understands the risks and benefits, and is willing to allocate the resources, you can create an excellent disaster recovery plan. That’s the way it should be, because one thing you can count on in life is that there will be a disaster at one time or another. Thus, you had better plan for it.