Master of disaster: Basic primer offers step-by-step recovery planning for the SMB

Disaster Recovery Planning may seem an abstract concept, but the need for it is critical to ensure business continuity during an interruption. An unexpected event like a flood or fire will activate an emergency plan. We present here a practical guide to disaster recovery, from planning through implementation.


Disaster Recovery Planning may seem an abstract concept, but the need for it is critical to ensure business continuity during an interruption. An unexpected event like a flood or fire will activate an emergency plan. Presented here is a practical guide to disaster recovery, from planning through implementation.

Your first question: Why develop a plan?

Disasters come in all forms: weather storms, electrical outages, fires, and floods. Enterprises must determine the how an outage would impact the organization. It is the responsibility of senior management to determine the resources that will be invested in disaster recovery planning to ensure success.

Disasters do occur and we must be prepared to respond to them. Over the past 20, I have encountered several outages ranging from simple electrical mishaps to major disruptions spanning over a period of days. Without a plan that was thoroughly tested, there would have been severe repercussions for the business.


A successful disaster recovery plan requires support from senior management. Without this support and cooperation of the management team, the plan will have limited effectiveness when disaster strikes.

A team needs to be established, and a manual must be developed, detailing the response to each type of event .The team should be composed of knowledgeable users and managers from all key areas, including IT. Their goal is to determine the best course of action to be taken when there is an event.

Various scenarios should be reviewed and documented. Forms should be developed that enable the business to manually record transactions for eventual update to the computer systems.

A disaster recovery manual should include these points:

• A contingency plan overview;

• Definition of short-term, mid-term, and long-term events;

• Official policy approved by senior management;

• Statement of critical resources;

• Plan to ensure critical resources;

• Well understood and accepted definition of responsibility;

• Detailed written procedures to accomplish pre- and post-disaster activities; and

• A methodology to ensure that the plan can implemented at a moment’s notice.

Short-term outages are those affecting major business components for less than a day. Mid-term scenarios typically last more than days but have a definitive end. Long-term outages are defined by the need to operate away from the primary facility.

Backup and recovery

The first step in media storage planning is to identify the information on servers and PCs and implement back-up procedures. The same back-up and recovery safeguards that apply to servers must also apply to critical PCs.

PC backup
Today there are many ways to back up PCs and servers. A few of the most economical ways of backing up PCs are USB Disk Drives and Flash Drives. These methods do an adequate job but impose security issues. A preferred approach is to use Network Drives to ensure that user data will be regularly archived.

Server backup

Traditionally, server backup was performed to tape storage devices. Over the past few years, there has been a huge increase in the amount of storage required as well as the need to provide it 24/7. Therefore, alternative approaches may need to be considered. One approach is backing up to remote back-up servers. An alternate approach is backing up data to remote data storage facilities, which was made possible by the advent of high speed encrypted internet access. Backups should be stored off site in a certified secure location.

High availability

Many organizations are not able to afford any “downtime.” High Availability is now more then a buzzword: it’s a requirement. HA requires virtual duplication of server, application, data, and network. Even legacy applications can be modified to support high availability solutions by journaling critical databases.

Short-term planning

Short-term disruptions to a computer center require an analysis to determine the cost benefit of implementing safeguards such as:

• Uninterruptible power supplies (UPS);

• Back-up generators;

• Alternate voice and data communication paths.

The short-term plan must include a contact list of all personnel and key vendors.


In many ways the planning and recovery process is similar to a short-term plan. Today many organizations deploy high-speed circuits for their voice and data. These lines may not be available during an outage. Enterprises should have POTS telephone lines available in case of an outage. High-speed voice and data circuits should be backed up with alternate vendors to enable rapid network migration in the event of a failure. Consideration must be made to the availability of personnel required to respond to mid-term events.

Long-term planning

Long-Term events are the most difficult to respond to and will require temporary relocation of computer facilities and personnel. Agreements with a commercial disaster recovery firm, business partner, or remote plant must be in place and must be tested on an annual basis. Backup configurations must include comparable hardware, software, and communication network. Application keys and certificates may be required.

The business units must test their disaster recovery operational plans on an annual basis.

During a major outage, the key functional areas must be able to continue to do business.

When the site has recovered and is ready to go back online, all systems must be thoroughly tested. The data may need to be reloaded from the back-up site. There should be a predefined process to confirm that the site is ready to go live.

All manufacturing organizations must address the issue of disaster recovery. The best place to start is with a plan that is developed across functional areas. A committee should be established to create and maintain the plan and it must be periodically tested. Everyone involved must thoroughly understand their roles and be in a position to implement then when a disaster strikes.

About the author:

Mark Shurr, VP, Ada Business Technology, has more than 20 years experience as an IT executive, including with a Fortune 1,000 international company. He can be reached at

No comments
The Top Plant program honors outstanding manufacturing facilities in North America. View the 2013 Top Plant.
The Product of the Year program recognizes products newly released in the manufacturing industries.
The Engineering Leaders Under 40 program identifies and gives recognition to young engineers who...
A cool solution: Collaboration, chemistry leads to foundry coat product development; See the 2015 Product of the Year Finalists
Raising the standard: What's new with NFPA 70E; A global view of manufacturing; Maintenance data; Fit bearings properly
Sister act: Building on their father's legacy, a new generation moves Bales Metal Surface Solutions forward; Meet the 2015 Engineering Leaders Under 40
Cyber security cost-efficient for industrial control systems; Extracting full value from operational data; Managing cyber security risks
Drilling for Big Data: Managing the flow of information; Big data drilldown series: Challenge and opportunity; OT to IT: Creating a circle of improvement; Industry loses best workers, again
Pipeline vulnerabilities? Securing hydrocarbon transit; Predictive analytics hit the mainstream; Dirty pipelines decrease flow, production—pig your line; Ensuring pipeline physical and cyber security
Upgrading secondary control systems; Keeping enclosures conditioned; Diagnostics increase equipment uptime; Mechatronics simplifies machine design
Designing positive-energy buildings; Ensuring power quality; Complying with NFPA 110; Minimizing arc flash hazards
Building high availability into industrial computers; Of key metrics and myth busting; The truth about five common VFD myths

Annual Salary Survey

After almost a decade of uncertainty, the confidence of plant floor managers is soaring. Even with a number of challenges and while implementing new technologies, there is a renewed sense of optimism among plant managers about their business and their future.

The respondents to the 2014 Plant Engineering Salary Survey come from throughout the U.S. and serve a variety of industries, but they are uniform in their optimism about manufacturing. This year’s survey found 79% consider manufacturing a secure career. That’s up from 75% in 2013 and significantly higher than the 63% figure when Plant Engineering first started asking that question a decade ago.

Read more: 2014 Salary Survey: Confidence rises amid the challenges

Maintenance and reliability tips and best practices from the maintenance and reliability coaches at Allied Reliability Group.
The One Voice for Manufacturing blog reports on federal public policy issues impacting the manufacturing sector. One Voice is a joint effort by the National Tooling and Machining...
The Society for Maintenance and Reliability Professionals an organization devoted...
Join this ongoing discussion of machine guarding topics, including solutions assessments, regulatory compliance, gap analysis...
IMS Research, recently acquired by IHS Inc., is a leading independent supplier of market research and consultancy to the global electronics industry.
Maintenance is not optional in manufacturing. It’s a profit center, driving productivity and uptime while reducing overall repair costs.
The Lachance on CMMS blog is about current maintenance topics. Blogger Paul Lachance is president and chief technology officer for Smartware Group.