How to build scalable data models with MQTT Sparkplug

A key to bridging the OT/IT gap is enabling successful data modeling, which is how organizations define and organize their business processes.

By Arlen Nipper, May 20, 2021

Some companies undergoing digital transformation expect a straight, simple line from operational technology (OT) data to enterprise applications (see Figure 1). They hope to collect the data, add some information technology (IT)/Cloud tooling and achieve a simple internet of things (IoT) solution. In reality, OT data comes from myriad sources with various data types requiring complex IT/Cloud tooling to make sense of it all. OT data needs vary greatly from IT data needs, and companies need a way to satisfy both sides to successfully embrace IoT and digital transformation.

Courtesy: Cirrus Link

OT data consists of proprietary protocols and multiple data formats, varies across market segments and includes no contextual information. The data is designed for operations and is retrieved with poll/response methodology, then is directly coupled to applications over isolated networks.

IT requires data for data objects and modeling, in standard data formats and with contextual information, and the data must be secure and easy to integrate. The data should be decoupled from the applications that consume it across the enterprise and is best retrieved with publish/subscribe methodology.

An OT-centric data model

MQTT Sparkplug has been touted as an excellent IoT protocol because it is a lightweight, publish/subscribe network protocol that is simple, efficient, secure and open with no vendor lock-in. MQTT is message-oriented middleware: a client connects to the broker and then publishes information. The data is decoupled, so one edge device can publish a metric and 100 applications (or more) can subscribe. The benefits are well documented. However, the purpose here is to focus on one benefit of the Sparkplug B specification: it defines an OT-centric data model/asset.
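The decoupling described above can be illustrated with a minimal sketch. It is not an MQTT client: InMemoryBroker is a hypothetical, in-process stand-in for a broker (exact-match topics only, no wildcards, QoS or network), and the topic name and payload are made-up sample values. It only shows that the publisher never needs to know how many subscribers exist.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic and the raw payload bytes.
Handler = Callable[[str, bytes], None]


class InMemoryBroker:
    """Toy stand-in for an MQTT broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register interest in a topic; the publisher is never told about it.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        # Fan the message out to every current subscriber of this exact topic
        # and return how many handlers were called.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = InMemoryBroker()
received: List[bytes] = []

# 100 independent "applications" subscribe to the same metric.
for _ in range(100):
    broker.subscribe("turbine1/windspeed", lambda topic, payload: received.append(payload))

# One edge device publishes once; the broker handles the fan-out.
delivered = broker.publish("turbine1/windspeed", b"12.5")
```

With a real broker the same pattern holds across a network: adding the 101st consumer requires no change at the edge device.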

Sparkplug is a new specification within the Eclipse Tahu project that defines how to use MQTT in a mission-critical, real-time environment. Sparkplug defines a standard MQTT topic namespace, payload and session state management for industrial applications while meeting the requirements of real-time supervisory control and data acquisition (SCADA) implementations. The Sparkplug B specification provides the data model needed to define a tag value for use with OT while also providing that data to IT in a form that is 100% self-discoverable and easy to consume.
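As a rough illustration of the standard topic namespace, the Sparkplug B specification lays out topics as namespace/group_id/message_type/edge_node_id/[device_id], with spBv1.0 as the namespace. The sketch below builds and parses such topics. The helper names (SparkplugTopic, build_topic, parse_topic) and the sample IDs are assumptions made for this example, and STATE messages, which use a different topic layout, are deliberately left out.

```python
from typing import NamedTuple, Optional

NAMESPACE = "spBv1.0"  # Sparkplug B topic namespace

# Message types that address an edge node vs. a device behind an edge node.
NODE_TYPES = {"NBIRTH", "NDEATH", "NDATA", "NCMD"}
DEVICE_TYPES = {"DBIRTH", "DDEATH", "DDATA", "DCMD"}


class SparkplugTopic(NamedTuple):
    group_id: str
    message_type: str
    edge_node_id: str
    device_id: Optional[str] = None


def build_topic(t: SparkplugTopic) -> str:
    """Return the topic string, validating the node/device distinction."""
    if t.message_type in NODE_TYPES:
        if t.device_id is not None:
            raise ValueError("node-level messages do not carry a device_id")
        parts = [NAMESPACE, t.group_id, t.message_type, t.edge_node_id]
    elif t.message_type in DEVICE_TYPES:
        if not t.device_id:
            raise ValueError("device-level messages require a device_id")
        parts = [NAMESPACE, t.group_id, t.message_type, t.edge_node_id, t.device_id]
    else:
        raise ValueError(f"unsupported message type: {t.message_type}")
    return "/".join(parts)


def parse_topic(topic: str) -> SparkplugTopic:
    """Split a topic string back into its elements and validate it."""
    parts = topic.split("/")
    if len(parts) not in (4, 5) or parts[0] != NAMESPACE:
        raise ValueError(f"not a Sparkplug B topic: {topic}")
    device_id = parts[4] if len(parts) == 5 else None
    parsed = SparkplugTopic(parts[1], parts[2], parts[3], device_id)
    build_topic(parsed)  # reuse the validation rules above
    return parsed


# Sample IDs are made up for illustration.
ddata_topic = build_topic(SparkplugTopic("WindFarm", "DDATA", "Edge1", "Turbine1"))
nbirth_topic = build_topic(SparkplugTopic("WindFarm", "NBIRTH", "Edge1"))
```

Because every element has a fixed position, a subscriber can tell which group, edge node and device a message belongs to from the topic alone.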

MQTT Sparkplug establishes a single source of truth for models, assets and tags at the edge, taking in OT data from various data sources and protocols and defining it for IT (see Figure 2). When customers start designing an IoT system, it is ideal for the data model to be as far to the edge as possible. Ideally, the data model should be in the device to establish that reliable, single source of truth.

Tags are the only piece of this puzzle typically addressed by IoT platforms and solutions, but MQTT Sparkplug goes beyond tags to create a single source of truth for models and assets as well, without custom code, scripts, Python, Java or anything else complex. Homegrown code rarely scales or works long term.

When OT data is collected and the model/asset/tags are converted to MQTT Sparkplug, the data can be sent to Cloud and enterprise applications for the auto-creation of data models without any coding required. OT data is converted to IT data, then put in a standard interface for Big Data, which leads to scalable data insights and business improvements.

Windfarm example

Cirrus Link built a sample use case for the MQTT Sparkplug data modeling capabilities at a windfarm. We connected a wind turbine, added attributes and process variables with MQTT Sparkplug, then created the model in AWS IoT SiteWise. The benefit of this solution is that companies can start where the expertise is, at the edge, at the wind turbine, and then create the model to be consumed by any third-party or Cloud application. MQTT Sparkplug provides the technology to create a model that says, “This is a wind turbine, at this location, with these process variables: windspeed, RPM and direction.” Then MQTT Sparkplug provides a model all the way from the edge to the Cloud for a single source of truth.
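The model/asset/tag relationship in this example can be sketched in plain Python. This is an illustration of the idea only, not the Sparkplug payload encoding or the AWS IoT SiteWise API: the Model and Asset classes, the asset name, the location and the readings are all hypothetical, and only the process variables (windspeed, RPM, direction) come from the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Model:
    """Generic definition: what every asset of this kind looks like."""
    name: str
    attributes: Tuple[str, ...]          # static context, e.g. location
    process_variables: Tuple[str, ...]   # live values, i.e. the tags


@dataclass
class Asset:
    """One concrete instance of a model, populated with values."""
    model: Model
    name: str
    attributes: Dict[str, str]
    values: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every attribute the model declares must be supplied.
        missing = set(self.model.attributes) - set(self.attributes)
        if missing:
            raise ValueError(f"missing attributes: {sorted(missing)}")

    def update(self, variable: str, value: float) -> None:
        # Only variables declared by the model are accepted.
        if variable not in self.model.process_variables:
            raise KeyError(f"{variable} is not defined by model {self.model.name}")
        self.values[variable] = value

    def tags(self) -> List[Tuple[str, float]]:
        # Flatten to (tag path, value) pairs, in the order the model declares.
        return [
            (f"{self.name}/{variable}", self.values[variable])
            for variable in self.model.process_variables
            if variable in self.values
        ]


# "This is a wind turbine, at this location, with these process variables."
WIND_TURBINE = Model(
    name="WindTurbine",
    attributes=("location",),
    process_variables=("windspeed", "RPM", "direction"),
)

# Hypothetical asset and sample readings.
turbine = Asset(model=WIND_TURBINE, name="Turbine1", attributes={"location": "Site A"})
turbine.update("windspeed", 12.5)
turbine.update("RPM", 14.0)
turbine.update("direction", 270.0)
```

Defining the model once and instantiating it per turbine is what lets the same definition be reused for every additional asset.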

Now, any IoT platform, solution or application can be either a consumer or a provider of data models. There is no other technology besides MQTT Sparkplug that allows companies to build a generic data model, then an asset and then populate the asset. Without coding? Unheard of. OPC UA data models compete on some level, but you can’t create those yourself. Plus, the true beauty of the solution is that this proper model/asset/tag definition, enabled by MQTT Sparkplug, allows the solution to be replicated at scale. The unique capabilities built into MQTT Sparkplug to define the data model and asset are proving to be an important differentiator in the IoT marketplace.


Author Bio: Arlen Nipper is president and CTO of Cirrus Link, where he brings more than 40 years of experience in the SCADA industry. He was one of the early architects of pervasive computing and the Internet of Things and co-invented MQTT, a publish-subscribe network protocol that has become the dominant messaging standard in IoT. Arlen holds a bachelor’s degree in Electrical and Electronics Engineering (BSEE) from Oklahoma State University.