How to effectively operate mission critical facilities

Achieving uninterrupted mission critical facility operation requires both standardized employee training and detailed documentation.


Specialized technicians monitor the data center power and cooling systems, which provide alerts on any system issues. This type of technician is trained to respond to a wide variety of failure scenarios using emergency operating procedures.

Mission critical environments are just that: critical to business functions. Without exception, mission critical facilities cannot bear any shutdowns or interruptions. This applies even during planned maintenance, making proper preparation a vital factor in reducing human errors, equipment failures, and downtime. The successful operation of mission critical data center facilities requires process standardization, especially in the important areas of training and documentation. Properly executing these functions in support of equipment maintenance activities can alleviate a primary root cause of downtime.

The goal of every mission critical facility is to operate safely, reliably, and efficiently at its design capacity. Most studies of downtime in mission critical environments come to the same conclusion: human error is a leading cause. While there is no way to completely eliminate human error and its negative effects on business productivity, there are a number of steps facility managers can take to greatly reduce its frequency and impact. The most reliable method is to invest in effective documentation and training programs, which will provide the basis for improving accuracy, consistency, and reliability. 

Documentation and reporting

Personnel conduct a vibration measurement on a rotary UPS system. This information is used to perform an analysis as part of the reliability centered maintenance program of Lee Technologies (part of Schneider Electric). Courtesy: Schneider Electric

Nearly all critical facility operations have some level of documentation in place; however, some documentation programs do not meet the needs of mission critical environments. Considering the importance of accurate and current documentation to the reliable operation of the facility, a strong program standard is warranted. Structured documentation programs have a cost that varies according to system complexity, the facility automation scheme, and the level of change management needed to achieve the reliability and uptime goals of the enterprise.

Mission critical facilities are delivered with a considerable volume of documentation, but sustaining operations effectively depends on having the right type of documentation. Typically, the detailed procedures needed to perform important daily functions are missing or incomplete.

Proper documentation requires the following: 

  • Detailed written procedures for all operations and maintenance activities including:
    • Emergency operating procedures (EOP)
    • Standard operating procedures (SOP)
    • Methods of procedure (MOP)
    • Administrative procedures (AP)
  • Site walk-through procedures
  • Facility work rules
  • Change management processes and procedures
  • Accurate and up-to-date drawings and schedules
  • Report templates
    • Weekly, monthly, quarterly reports on facility operations and system capacities
    • Incident reports
    • Failure analysis
    • Lessons learned
    • Near-misses.  
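
One way to keep track of the list above is a simple inventory check that reports which required documents a site has not yet produced. The following is a minimal sketch only: the category keys, document labels, and the `missing_documents` helper are hypothetical names chosen for illustration, not part of any standard or tool.

```python
from typing import Dict, List, Set

# Required documents, grouped as in the list above.
# The category keys and labels are illustrative assumptions.
REQUIRED_DOCUMENTS: Dict[str, List[str]] = {
    "procedures": ["EOP", "SOP", "MOP", "AP"],
    "site": [
        "walk-through procedures",
        "facility work rules",
        "change management",
        "drawings and schedules",
    ],
    "report_templates": [
        "periodic operations report",
        "incident report",
        "failure analysis",
        "lessons learned",
        "near-miss",
    ],
}


def missing_documents(on_file: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per category, the required documents that are not on file."""
    missing: Dict[str, List[str]] = {}
    for category, required in REQUIRED_DOCUMENTS.items():
        have = on_file.get(category, set())
        gaps = [doc for doc in required if doc not in have]
        if gaps:
            missing[category] = gaps
    return missing


# Hypothetical site that has everything except two procedure types.
site_inventory = {
    "procedures": {"EOP", "SOP"},
    "site": set(REQUIRED_DOCUMENTS["site"]),
    "report_templates": set(REQUIRED_DOCUMENTS["report_templates"]),
}
print(missing_documents(site_inventory))
```

A check like this only confirms that a document exists; whether it is accurate and current still depends on the change management process.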


This technician is performing an inspection on a diesel generator fuel system during a preventative maintenance event. Proper training allows for this inspection in lieu of subcontracting the service to an outside source. Courtesy: Schneider Electric

Employee training should be a priority when new staff are hired, and should be conducted at regular intervals to ensure all personnel are up to date on any changes in industry standards and organizational best practices. Properly trained employees understand how the plant works, how to safely operate and maintain the plant equipment, and what to do when equipment and systems don’t function as expected. Thorough, accurate, and readily accessible documentation is both the foundation of this knowledge and the means of implementing it. However, the establishment of a comprehensive documentation and training program is a crucial, but rarely achieved, goal in mission critical environments.

What constitutes “proper training”? A best practice approach is to implement a multilevel training program that aligns each site operating procedure to a specific level of certification. This ensures that all operating and maintenance procedures are conducted or supervised by fully qualified personnel. Certification is achieved through a rigorous evaluation program, with regular recertification required. Such a program requires a large variety of materials and methods, such as: 

  • Theory of operation for major equipment and systems
  • Training modules for EOPs, SOPs, and MOPs
  • Drills for EOPs
  • Exams for various training levels 

Personnel are performing a switching procedure using an approved method of procedure. Note that they are using the “pilot-copilot” method of stepping through the checklist. Courtesy: Schneider Electric

It all starts with the most difficult aspect of any training program: developing the training materials. However, this effort cannot begin without timely and accurate information from the design and construction teams on the equipment configuration, the basis of design, the sequence of operations, and the as-built configuration. While this information may seem readily available, it is often poorly documented and delivered late. This is a major issue for both the commissioning and operations teams.

The main reason for the lack of effective training programs is the time and expense of development and training activities. This is a short-sighted view, however, as the cost and effort are largely offset by the resulting increased uptime, lower maintenance costs, and decreased employee turnover. The fact is that a proper documentation and training program is as important to achieving the required facility performance, efficiency, and reliability goals as the quality of the system design itself.

An effective multilevel training program can be broken down into four certification levels: 

  • Level 1: Basic knowledge and emergency response
    • Level goal: Train an employee capable of properly responding to emergency situations.
    • Training covers:
      • Administrative functions
      • Theory of operation
      • Daily routines
      • Security policies
      • Emergency procedures.
  • Level 2: Intermediate knowledge and frequent procedures
    • Level goal: Provide focused teaching of critical systems in order for the employee to begin participating in routine work practices.
    • Training covers:
      • Technical critical systems equipment knowledge
      • Frequently performed and/or elementary operational procedures
      • Frequently performed maintenance procedures.
  • Level 3: Advanced knowledge and infrequent procedures
    • Level goal: Broaden training to include noncritical systems, and provide additional in-depth training on critical systems.
    • Training covers:
      • Technical noncritical systems equipment knowledge
      • Infrequently performed maintenance procedures
      • Infrequently performed and/or moderately difficult operational procedures.
  • Level 4: Subject matter expertise on specific systems
    • Level goal: Train employees to become subject matter experts so they in turn will be able to train new employees.
    • Training covers:
      • Select, technically difficult procedures throughout the facility
      • Specialized outside training
      • Training course development
      • Training delivery. 
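
The idea of aligning each site procedure to a certification level, so that work is always conducted or supervised by qualified personnel, can be sketched as a small data model. This is an illustration under stated assumptions: the class names, the `may_perform` helper, and the example procedure and its required level are all hypothetical, and a real program would set levels per site.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class CertLevel(IntEnum):
    """The four certification levels described above."""
    BASIC = 1         # basic knowledge and emergency response
    INTERMEDIATE = 2  # critical systems and frequent procedures
    ADVANCED = 3      # noncritical systems and infrequent procedures
    EXPERT = 4        # subject matter expertise on specific systems


@dataclass(frozen=True)
class Procedure:
    name: str
    required_level: CertLevel


@dataclass(frozen=True)
class Technician:
    name: str
    level: CertLevel


def may_perform(
    tech: Technician,
    proc: Procedure,
    supervisor: Optional[Technician] = None,
) -> bool:
    """Allow the work if the technician, or a supervising technician,
    holds at least the certification level the procedure requires."""
    if tech.level >= proc.required_level:
        return True
    return supervisor is not None and supervisor.level >= proc.required_level


# Hypothetical example: an infrequent switching procedure set at Level 3.
switching = Procedure("Switchgear maintenance MOP", CertLevel.ADVANCED)
trainee = Technician("Technician A", CertLevel.INTERMEDIATE)
lead = Technician("Technician B", CertLevel.EXPERT)

print(may_perform(trainee, switching))                   # False
print(may_perform(trainee, switching, supervisor=lead))  # True
```

Using an ordered enumeration keeps the comparison simple: a higher certification level implicitly covers the procedures assigned to the levels below it, which is an assumption of this sketch rather than a rule stated in the program description.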

Personnel prepare for switchgear maintenance, which is performed three times per year. This is a potentially hazardous procedure that can only be performed with a detailed procedure and by personnel with thorough training in this maintenance operation. Courtesy:

Training doesn’t end after an employee has qualified and become certified at a certain level. It’s vital to continuously supplement that knowledge with lessons learned from all available sources, particularly the direct experience of the facility technical workforce. This new information is incorporated into the training program and formalized in the recertification process. To test skills and responsiveness, ongoing emergency response drills are conducted that keep employees at peak readiness to handle any emergent events in the mission critical environment.

Achieving uninterrupted mission critical facility operation requires more than an investment in redundant critical infrastructure systems. It also requires both a financial investment and time commitment in their sustained operation, which stems from properly documenting the environment and training staff in conducting regularly scheduled, standardized maintenance on all facility equipment. 

The cost of these programs should be considered necessary to fulfill the critical mission and to protect the original infrastructure investment. The cost of creating and consistently implementing high-quality employee training and conducting effective maintenance is offset by increased uptime, longer asset life, more efficient system operations, and less employee turnover. 

As senior vice president of critical environment services, Woolley oversees the operation of all on-site facility operations and maintenance programs at data center solutions provider Lee Technologies, a subsidiary of Schneider Electric. He also leads the quality system group, which establishes and continuously improves the company’s service offerings, and is responsible for the company’s environmental health and safety program. He has been involved in the mission critical facilities management field for more than 20 years and has extensive experience in building technical service programs in addition to managing operations for more than 50 data centers throughout his career.
