Machine learning for engineers from time-series data

When data scientists are hard to come by, companies can find themselves at a disadvantage despite all the operational data at their fingertips. Four steps for getting started with machine learning are highlighted.

By Sanket Amberkar, Falkonry, May 21, 2018

Machine learning for engineers

This challenge is addressed by operational machine learning systems ready for use by industrial SMEs. SMEs can feed existing multivariate time-series data generated by operations into the machine learning system and, based on its outputs, determine a course of action. The approach is effective because:

  • SMEs retain both the control and visibility needed to drive improvements.
  • SMEs are in the best position to determine the corrective action to take.
  • Identifying the use case and verifying the results within an operations team delivers quick time to value.

An example of such an approach is shown in Figure 2. Time-series data can come from a variety of sources and be provided as a real-time data stream to the machine learning system, which in turn analyzes the data to generate predictions on the equipment, system, or process state based on conditions the SMEs have trained it to find.
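As a rough illustration of what "multivariate data from a variety of sources" can mean in practice, the sketch below aligns two independently sampled sources onto one shared timeline by carrying the last known value forward. The function name `align_streams`, the signal names, and the readings are all hypothetical, and forward-filling is only one of several reasonable alignment choices; the article does not say how any particular system does this.

```python
def align_streams(streams, timestamps):
    """Forward-fill each source onto a shared timeline.

    streams: {signal_name: [(time, value), ...]} with readings sorted by time.
    Returns one row per timestamp; a signal is None until its first reading.
    """
    rows = []
    for t in timestamps:
        row = {"t": t}
        for name, readings in streams.items():
            latest = None
            for reading_time, value in readings:
                if reading_time <= t:
                    latest = value      # most recent reading at or before t
                else:
                    break               # readings are sorted, so stop early
            row[name] = latest
        rows.append(row)
    return rows


# Hypothetical readings from two sources sampled at different times.
streams = {
    "temperature": [(0, 70.0), (2, 71.0)],
    "vibration": [(1, 1.0), (3, 1.4)],
}
rows = align_streams(streams, timestamps=[0, 1, 2, 3])
```

Each resulting row is one multivariate observation, which is the shape of input a pattern-discovery step would typically expect.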

Three key capabilities built into the system make it unique and explain why it is referred to as a “data scientist in a box”:

1. Unsupervised feature learning: Insights in time-series data come from multivariate trends or patterns, reviewed forensically to understand past system behavior. Patterns are often hard to describe and must be learned based on signals across data sources. Traditional analytics are unable to effectively learn or recognize time-series patterns. Feature learning autonomously discovers patterns hidden in time-series data, including patterns that go undetected by human observation. Once the patterns are discovered, machine learning can be applied.

2. Machine learning and predictions: This approach doesn’t just recognize defined patterns, but identifies new ones, correlating the patterns to operational events. From that point, an industrial SME can label those that correspond to known conditions such as a fault or downtime using the system’s log data. An SME can select from other consistent patterns preceding the fault condition and label one as a precursor event (see Figure 3). Because a precursor event tends to appear before the fault condition occurs, it can serve as an early warning or prediction.

3. Explanation: Most artificial intelligence and deep learning techniques can’t explain how they derive a prediction. Data scientists must interpret the model and resulting analyses. In comparison, the “data scientist in a box” approach determines which data signals the prediction requires and how much each signal contributed. This is a valuable insight for the sake of credibility and for guidance in determining root cause.
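The three capabilities above can be sketched end to end on a toy dataset. This is a minimal illustration under stated assumptions, not the method used by any particular product: pattern discovery is approximated with a small k-means over per-window summaries, precursor selection with a count of which pattern precedes each fault onset, and explanation with each signal's share of the squared deviation from the normal pattern. All names (`window_features`, `cluster`, `precursor_rates`, `signal_contributions`), the synthetic signals, and the fault log are hypothetical.

```python
from collections import Counter
from statistics import mean

def window_features(signals, width):
    """Summarize each non-overlapping window by the mean and range of every signal."""
    names = sorted(signals)
    n = len(signals[names[0]])
    rows = []
    for start in range(0, n - width + 1, width):
        row = []
        for name in names:
            chunk = signals[name][start:start + width]
            row += [mean(chunk), max(chunk) - min(chunk)]
        rows.append(row)
    return rows

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster(points, k, iterations=10):
    """Tiny k-means with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    def assign():
        return [min(range(k), key=lambda i: sq_dist(p, centers[i])) for p in points]

    for _ in range(iterations):
        labels = assign()
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = [mean(col) for col in zip(*members)]
    return assign(), centers

def precursor_rates(labels, fault_label, lookback=1):
    """Fraction of fault onsets preceded by each non-fault pattern within `lookback` windows."""
    counts, onsets = Counter(), 0
    for i, lab in enumerate(labels):
        if lab == fault_label and (i == 0 or labels[i - 1] != fault_label):
            onsets += 1
            counts.update(set(labels[max(0, i - lookback):i]) - {fault_label})
    return {lab: c / onsets for lab, c in counts.items()}

def signal_contributions(names, center, baseline):
    """Share of the squared deviation from baseline that each signal accounts for."""
    per_feature = [(c - b) ** 2 for c, b in zip(center, baseline)]
    # Two features (mean, range) per signal, in the order produced by window_features.
    per_signal = {name: sum(per_feature[2 * j:2 * j + 2]) for j, name in enumerate(names)}
    total = sum(per_signal.values())
    return {name: (v / total if total else 0.0) for name, v in per_signal.items()}

# Synthetic data (assumption): vibration rises one window before each fault.
WIDTH = 5
levels = {"N": (70.0, 1.0), "P": (70.0, 2.0), "F": (90.0, 3.0)}  # (temperature, vibration)
signals = {"temperature": [], "vibration": []}
for w, kind in enumerate(["N", "N", "P", "F", "N", "N", "N", "P", "F", "N"]):
    for j in range(WIDTH):
        ripple = 0.2 * ((w * WIDTH + j) % 2)
        signals["temperature"].append(levels[kind][0] + ripple)
        signals["vibration"].append(levels[kind][1] + ripple)

# 1. Discover patterns without labels. Real data would need normalization first.
labels, centers = cluster(window_features(signals, WIDTH), k=3)

# 2. The SME ties one pattern to logged faults, then looks for what precedes it.
fault_windows = [3, 8]                      # hypothetical event log
fault_label = labels[fault_windows[0]]
rates = precursor_rates(labels, fault_label)
precursor_label = max(rates, key=rates.get)

# 3. Explain the precursor: which signal separates it from the most common pattern?
normal_label = Counter(labels).most_common(1)[0][0]
contributions = signal_contributions(
    sorted(signals), centers[precursor_label], centers[normal_label]
)
```

In this toy case the precursor pattern precedes every fault onset and nearly all of its deviation from normal comes from vibration, which is the kind of per-signal attribution the explanation capability describes.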

Applications across industries

Since machine learning discovers patterns in available time-series data, it doesn’t require developing mathematical models of physical systems, nor does it require suppliers to have deep industrial domain expertise. In process operations, consistent, on-spec quality with profitable yields requires insight into real-time process conditions, a step beyond merely maintaining equipment. The time-series data controlling, or produced by, these processes defines the process health, estimates quality output, and reflects process yield. Machine learning delivers the insight to optimize outcomes and avoid machine failures.

Dealing with multivariate time-series data seems complex, but machine learning addresses much of that complexity. With “ready to use” operational machine learning, historical data can be selected, patterns discovered, and models built and verified, eliminating the need for a data scientist or third-party consultants and thus significantly reducing costs. Putting machine learning in SMEs’ hands shortens time to value, from identifying use cases to verifying results, because the process resides within the operations team. This approach delivers substantial improvements in asset performance, throughput, operator safety, and product quality.

Four steps for getting started with machine learning

1. Identify a use case and ask:

  • What is the specific problem you are trying to solve?
  • What approaches have you tried to solve this problem?
  • What is the cost of doing nothing?

2. Assess your readiness by asking:

  • Do you collect and archive time-series data?
  • Do you have a data historian?
  • Have you prioritized your signals, or do you have a hypothesis on where to look?
  • Have you defined evaluation criteria for success?

3. Implement a pilot by:

  • Providing a dataset for evaluation (for training the model).
  • Taking the product training.
  • Labeling the known events identified by the model.

4. Operationalize the pilot results by:

  • Validating the precursor events and alerts identified by the model.
  • Assessing the models’ ability to reliably predict events and provide timely alerts.
  • Incorporating predictive analytics into work processes and organizational plans.
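
One way to make "reliably predict events and provide timely alerts" measurable is to score alerts against the logged faults. The sketch below is a minimal example under assumptions of our own: an alert counts as useful if a fault follows within a chosen horizon, and lead time is measured from the earliest such alert. The function name `score_alerts`, the horizon, and the timestamps (in hours) are hypothetical.

```python
def score_alerts(alert_times, fault_times, horizon):
    """Score alerts against faults.

    An alert is useful if some fault follows it within `horizon`.
    A fault is detected if some alert precedes it within `horizon`.
    """
    lead_times = []
    for fault in fault_times:
        prior = [a for a in alert_times if 0 < fault - a <= horizon]
        if prior:
            lead_times.append(fault - min(prior))   # earliest warning for this fault
    useful = [
        a for a in alert_times
        if any(0 < fault - a <= horizon for fault in fault_times)
    ]
    return {
        "precision": len(useful) / len(alert_times) if alert_times else 0.0,
        "recall": len(lead_times) / len(fault_times) if fault_times else 0.0,
        "mean_lead_time": sum(lead_times) / len(lead_times) if lead_times else None,
    }


# Hypothetical pilot results: times in hours since the start of the evaluation.
scores = score_alerts(
    alert_times=[10, 50, 95, 200],
    fault_times=[20, 100, 300],
    horizon=12,
)
```

Precision reflects how many alerts were worth acting on, recall how many faults were caught, and mean lead time how much warning the team received; the acceptable values for each belong in the evaluation criteria defined during the readiness step.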

Sanket Amberkar is senior vice president at Falkonry.

Original content can be found at Oil and Gas Engineering.