Master of disaster: Basic primer offers step-by-step recovery planning for the SMB

Disaster Recovery Planning may seem an abstract concept, but the need for it is critical to ensure business continuity during an interruption. An unexpected event like a flood or fire will activate an emergency plan. We present here a practical guide to disaster recovery, from planning through implementation.

01/21/2009




Your first question: Why develop a plan?

Disasters come in all forms: severe weather, electrical outages, fires, and floods. Enterprises must determine how an outage would impact the organization. It is the responsibility of senior management to determine the resources that will be invested in disaster recovery planning to ensure success.

Disasters do occur and we must be prepared to respond to them. Over the past 20 years, I have encountered several outages, ranging from simple electrical mishaps to major disruptions lasting several days. Without a thoroughly tested plan, there would have been severe repercussions for the business.

Help

A successful disaster recovery plan requires support from senior management. Without this support and cooperation of the management team, the plan will have limited effectiveness when disaster strikes.

A team needs to be established, and a manual must be developed, detailing the response to each type of event. The team should be composed of knowledgeable users and managers from all key areas, including IT. Their goal is to determine the best course of action to take when an event occurs.

Various scenarios should be reviewed and documented. Forms should be developed that enable the business to manually record transactions for eventual update to the computer systems.

A disaster recovery manual should include these points:

• A contingency plan overview;

• Definition of short-term, mid-term, and long-term events;

• Official policy approved by senior management;

• Statement of critical resources;

• Plan to ensure critical resources;

• Well understood and accepted definition of responsibility;

• Detailed written procedures to accomplish pre- and post-disaster activities; and

• A methodology to ensure that the plan can be implemented at a moment’s notice.

Short-term outages are those affecting major business components for less than a day. Mid-term scenarios typically last more than a day but have a definite end. Long-term outages are defined by the need to operate away from the primary facility.
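As an illustration only, the three event definitions above can be written down as a small decision rule, which a team might use when deciding which section of the manual to open. This is a minimal sketch: the names `OutageClass` and `classify_outage` are hypothetical, and the one-day boundary between short-term and mid-term events is an assumption drawn from the "less than a day" wording.

```python
from datetime import timedelta
from enum import Enum


class OutageClass(Enum):
    SHORT_TERM = "short-term"
    MID_TERM = "mid-term"
    LONG_TERM = "long-term"


def classify_outage(expected_duration: timedelta, must_relocate: bool) -> OutageClass:
    """Classify an outage using the definitions given in the text."""
    # Long-term events are defined by the need to operate away from the
    # primary facility, so relocation decides the class on its own.
    if must_relocate:
        return OutageClass.LONG_TERM
    # Short-term: major business components affected for less than a day.
    if expected_duration < timedelta(days=1):
        return OutageClass.SHORT_TERM
    # Anything longer that still has a definite end is treated as mid-term
    # (assumed boundary: one day).
    return OutageClass.MID_TERM
```

For example, a four-hour power failure with no relocation would be classified as short-term, while a fire that forces a move to a backup site would be long-term regardless of its expected duration.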

Backup and recovery

The first step in media storage planning is to identify the information on servers and PCs and implement back-up procedures. The same back-up and recovery safeguards that apply to servers must also apply to critical PCs.

PC backup
Today there are many ways to back up PCs and servers. Among the most economical ways of backing up PCs are USB disk drives and flash drives. These methods do an adequate job but pose security risks, since the media are easily lost or stolen. A preferred approach is to use network drives to ensure that user data will be regularly archived.
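One possible way to archive a user's folder to a network drive is sketched below. It is an illustration under stated assumptions, not a recommended product or procedure: the function name `archive_user_data` and the date-stamped file naming are hypothetical, and the network share is assumed to be mounted and writable as an ordinary directory. Scheduling, retention, and encryption are left out.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_user_data(user_dir: Path, network_share: Path, day: date) -> Path:
    """Zip user_dir into network_share as <folder>-<YYYY-MM-DD>.zip."""
    # Create the target folder on the share if it does not exist yet.
    network_share.mkdir(parents=True, exist_ok=True)
    base_name = network_share / f"{user_dir.name}-{day.isoformat()}"
    # make_archive appends the ".zip" extension and returns the archive path.
    archive_path = shutil.make_archive(str(base_name), "zip", root_dir=str(user_dir))
    return Path(archive_path)
```

Run daily by a scheduler, a routine like this would leave one dated archive per user folder on the share, which the server back-up process can then pick up.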

Server backup

Traditionally, server backup was performed to tape storage devices. Over the past few years, there has been a huge increase in the amount of storage required, as well as in the need to provide it 24/7. Therefore, alternative approaches may need to be considered. One approach is backing up to remote back-up servers. Another is backing up data to remote data storage facilities, which was made possible by the advent of high-speed encrypted Internet access. Backups should be stored off site in a certified secure location.

High availability

Many organizations are not able to afford any “downtime.” High Availability is now more than a buzzword: it’s a requirement. HA requires virtual duplication of server, application, data, and network. Even legacy applications can be modified to support high availability solutions by journaling critical databases.

Short-term planning

Short-term disruptions to a computer center require an analysis to determine the cost benefit of implementing safeguards such as:

• Uninterruptible power supplies (UPS);

• Back-up generators;

• Alternate voice and data communication paths.

The short-term plan must include a contact list of all personnel and key vendors.
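The cost-benefit analysis mentioned above can be approximated with simple arithmetic: compare the expected yearly loss from outages with and without a safeguard, then subtract what the safeguard costs per year. The sketch below assumes this simple model; the function names and all figures in the usage note are hypothetical and would need to be replaced with the organization's own estimates.

```python
def annual_expected_loss(outages_per_year: float,
                         hours_per_outage: float,
                         cost_per_hour: float) -> float:
    """Expected yearly loss: frequency x duration x hourly cost of downtime."""
    return outages_per_year * hours_per_outage * cost_per_hour


def safeguard_net_benefit(loss_without: float,
                          loss_with: float,
                          annual_safeguard_cost: float) -> float:
    """Yearly loss avoided by the safeguard, minus its yearly cost.

    A positive result suggests the safeguard pays for itself.
    """
    return (loss_without - loss_with) - annual_safeguard_cost
```

As a made-up example, suppose a site sees 3 power outages a year, each costing 2 hours at $5,000 per hour, so `annual_expected_loss(3, 2, 5000)` gives $30,000. If a UPS cut each outage to 0.25 hours, the loss would fall to $3,750, and with a yearly UPS cost of $8,000 the net benefit would be $18,250.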

Mid-term

In many ways the planning and recovery process is similar to a short-term plan. Today many organizations deploy high-speed circuits for their voice and data. These lines may not be available during an outage. Enterprises should have POTS (plain old telephone service) lines available in case of an outage. High-speed voice and data circuits should be backed up with alternate vendors to enable rapid network migration in the event of a failure. Consideration must be given to the availability of personnel required to respond to mid-term events.

Long-term planning

Long-term events are the most difficult to respond to and will require temporary relocation of computer facilities and personnel. Agreements with a commercial disaster recovery firm, business partner, or remote plant must be in place and must be tested on an annual basis. Backup configurations must include comparable hardware, software, and communication networks. Application keys and certificates may be required.

The business units must test their disaster recovery operational plans on an annual basis.

During a major outage, the key functional areas must be able to continue to do business.

When the site has recovered and is ready to go back online, all systems must be thoroughly tested. The data may need to be reloaded from the back-up site. There should be a predefined process to confirm that the site is ready to go live.
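A predefined go-live process can be as simple as a checklist in which every item must pass before the site returns to production. The sketch below shows one way to record and evaluate such a checklist. The function name `ready_to_go_live` and the check names in the usage note are hypothetical placeholders, not a prescribed list.

```python
def ready_to_go_live(checks: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (ready, failed_checks) for a go-live checklist.

    The site is considered ready only if the checklist is non-empty
    and every check has passed.
    """
    failed = [name for name, passed in checks.items() if not passed]
    ready = bool(checks) and not failed
    return ready, failed
```

For instance, a checklist of `{"systems tested": True, "data reloaded from back-up site": False, "network verified": True}` would return `(False, ["data reloaded from back-up site"])`, telling the team exactly which step still blocks the return to service.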

Conclusion
All manufacturing organizations must address the issue of disaster recovery. The best place to start is with a plan that is developed across functional areas. A committee should be established to create and maintain the plan, and the plan must be periodically tested. Everyone involved must thoroughly understand their roles and be in a position to carry them out when a disaster strikes.






About the author:

Mark Shurr, VP, Ada Business Technology, has more than 20 years of experience as an IT executive, including with a Fortune 1,000 international company. He can be reached at mshurr@AdaBusTech.com




