
Data lake technologies: Bridging the gap between OT and IT

May 12, 2021
The right data management tools help facilities connect, collect and contextualize plant data.
Industrial plants and facilities have put a strong emphasis on digital transformation to increase operational excellence. However, as the information technology (IT) and operational technology (OT) groups work together and try to bring these new initiatives online, they often run into barriers impeding progress.

OT architectures can complicate the efforts of IT personnel trying to gather data for widespread use throughout the enterprise. OT infrastructure frequently contains a wide variety of legacy equipment and systems, creating data silos and limiting efficient movement of data across the plant or enterprise. Moreover, these legacy systems often do not integrate seamlessly with new equipment, which makes the environment difficult to scale and makes new technologies such as analytics less cost-effective to implement.

Many organizations are embracing data lake technologies to better communicate with the wide variety of OT systems and equipment. These technologies provide flexible connectivity that leverages the organization’s existing investment in OT equipment, enabling greater visibility across the enterprise.

Organizations face many important choices when searching for the right data lake solution—one that will not only serve them well today, but also provide scalability to serve well into the future. To implement the best data lake solution, organizations should focus on the way technology will help connect, collect, and contextualize the data they rely on for continued reliability and operational success.

Data management software provides out-of-the-box connectivity


Most organizations are trying to increase and improve analytics, and many are even moving toward centralized monitoring and reporting to drive operational excellence. However, one of the key complications that IT groups face when trying to connect data across the plant or across the enterprise is the wide variety of new and legacy equipment using varying integration frameworks.

Engineering, reliability, maintenance, operations, research, and other departments may all need to collect and share OT data. Because data types and storage formats vary, delivering this collaboration often requires many connectivity software packages for security, buffering, tunneling, bridging, and redundancy management. The result is frequently a complex, difficult-to-maintain connectivity patchwork (see Figure 1).

Figure 1. Traditional OT architecture typically requires complex engineering via various connectivity packages to provide data flow among all functional areas. (Source: Emerson)

Moreover, when solutions are created, they are typically difficult to connect to emerging technologies, such as cloud analytics. Many older OT systems don’t use the modern integration frameworks that IT departments rely on for cloud connectivity. As a result, IT must perform makeshift development to make these connections, complicating management, security, and reliability. In the most severe instances, these solutions can exceed the capabilities of the OT systems’ infrastructure, causing occasional outages.

Data lakes solve the complex web of OT system connectivity problems, without the need to rip and replace old systems that are reliably performing essential tasks. These data lakes can be deployed either locally or in the cloud, and the most advanced lakes come with a wide range of out-of-the-box connectivity solutions, providing connections to nearly any OT system (see Figure 2). Organizations relying on infrastructure with performance limitations have the flexibility to throttle data rates to ensure collection doesn’t interrupt OT systems’ operation.

Figure 2. Advanced data lake solutions provide out-of-the-box connectivity to ensure all functional areas can connect quickly and easily to the data they need. (Source: Emerson)
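The throttling idea described above can be illustrated with a minimal sketch. It is not taken from any particular data lake product: the function name `throttled_collect`, the `read_tag` callback, and the `max_reads_per_second` parameter are all hypothetical, and a real connector would likely offer its own rate-limit settings.

```python
import time
from typing import Callable, Iterable, List, Tuple


def throttled_collect(
    tags: Iterable[str],
    read_tag: Callable[[str], float],
    max_reads_per_second: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, float]]:
    """Read each tag once, spacing reads so the OT system is not overloaded.

    read_tag is a hypothetical callback standing in for whatever driver
    actually talks to the OT system.
    """
    if max_reads_per_second <= 0:
        raise ValueError("max_reads_per_second must be positive")
    min_interval = 1.0 / max_reads_per_second
    samples: List[Tuple[str, float]] = []
    next_allowed = clock()
    for tag in tags:
        # Wait until the earliest time the next read is permitted.
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)
        samples.append((tag, read_tag(tag)))
        # Schedule the following read one interval after this one.
        next_allowed = max(next_allowed, clock()) + min_interval
    return samples
```

Passing `sleep` and `clock` as parameters is only a convenience so the pacing can be exercised without real delays. In practice the rate limit would be chosen to suit the weakest system being polled.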

For organizations operating in areas with limited communication infrastructure, or with cybersecurity concerns that restrict moving data offsite, advanced data lake software can be deployed on-premises or in the cloud, with the ability to move from one deployment model to the other. This allows an organization to meet security and regulatory needs by storing data in the cloud, locally, or in a combination of the two. As circumstances change, so can the infrastructure.

Unlock efficiency with automatic data collection and storage


Today, many plants rely on the historian to gather critical plant data. However, historians have significant limitations for anyone working outside the facility or the process engineering group. Typically, historians are not ideal for managing a wide variety of data types. Historian licensing by tag, the most common method, is cost-prohibitive for organizations trying to monitor many data points.

In addition, a historian is typically most effective in collecting and storing numerical data. While this is valuable for some functional areas, it leaves out many of the data types that functional groups rely upon such as photos, videos, spreadsheets, and more.

As a result, non-numerical data is not stored in the historian, or when stored is not easy to extract and use. This leaves many groups without access to data in the historian, or if they have access, it is only to a small sliver of the data they need. IT and OT are thus forced to manage a wide variety of systems, and these groups must develop secure solutions for the transfer of data among these systems.

Data lakes deliver much more flexible collection and storage capability. Eliminating tag-based licensing means teams can collect and store data at a much lower price. In addition, advanced data lakes provide automatic aggregation of data from a wide variety of sources, with storage in a central repository (see Figure 3).

Figure 3. Modern data lakes like Emerson’s Plantweb™ Optics automatically aggregate data from many different sources and file types to ensure uniformity and make it easy for users to find everything they need in one system. (Source: Emerson)

Automated aggregation reduces the effort required to access data because users don’t have to open multiple applications to locate data about a particular asset. It also increases efficiency and security at both the individual and plant level because data doesn’t have to be transferred manually, often on insecure devices such as flash drives. Data is also less likely to fall victim to human error in collection and transcription.

Data lakes typically use more advanced data collection and storage methods than historians. Newer database technologies such as NoSQL solutions improve flexibility and scalability compared to a historian. The databases used by data lakes readily store unstructured data and scale as data volume grows, providing an improved user experience via faster loading and retrieval of data.
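To illustrate what schemaless storage means in practice, here is a toy in-memory sketch. It is not a real NoSQL database and not any vendor's API: the `DocumentStore` class, the `kind` field, and the asset and file names are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


class DocumentStore:
    """Toy in-memory stand-in for a schemaless, document-style store."""

    def __init__(self) -> None:
        self._by_asset: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def put(self, asset_id: str, document: Dict[str, Any]) -> None:
        # No fixed schema: each document carries whatever fields it needs.
        self._by_asset[asset_id].append(dict(document))

    def find(self, asset_id: str, kind: Optional[str] = None) -> List[Dict[str, Any]]:
        # Return all documents for an asset, optionally filtered by kind.
        docs = self._by_asset.get(asset_id, [])
        if kind is None:
            return list(docs)
        return [d for d in docs if d.get("kind") == kind]


# Hypothetical records of very different shapes stored side by side.
store = DocumentStore()
store.put("pump-101", {"kind": "reading", "tag": "discharge_pressure",
                       "value": 4.2, "unit": "bar"})
store.put("pump-101", {"kind": "photo", "uri": "inspections/pump-101/seal.jpg"})
store.put("pump-101", {"kind": "work_order", "status": "open"})

photos = store.find("pump-101", kind="photo")
```

A numeric reading, a photo reference, and a work order sit under the same asset without a shared table definition. A historian built around numeric tags typically cannot do this.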

Moreover, advanced data collection systems can provide valuable functionality for resource-starved organizations. Data lakes in these systems can connect directly to a computerized maintenance management system (CMMS) to close the loop on maintenance. Even small maintenance teams can quickly identify problems, schedule repairs, and see the results of those repairs, all from one system.

Draw meaning from data with contextualization


Having access to large amounts of data is not enough because organizations also need ways to assign meaningful context to data. Limited consistency of data across the organization can make it difficult to draw conclusions leading to meaningful change.

Modern, flexible data lakes allow organizations to contextualize data in a hierarchical model. Instead of looking at one string of numerical data on the historian and comparing it manually with data from other systems, teams can process historical, asset, inspection, and CMMS data in one system (see Figure 4).

Figure 4. Modern data lakes can automatically contextualize data using advanced analytics and intuitive displays to help users make better decisions faster. (Source: Emerson)
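A hierarchical model of this kind can be sketched as a simple tree in which records from different source systems attach to the asset they describe. This is only an illustration under assumed names (`Node`, `attach`, `walk`, and the plant, unit, and asset labels are hypothetical), not a description of how any specific product organizes its data.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """One level in a plant hierarchy (site, unit, asset, ...)."""
    name: str
    records: List[dict] = field(default_factory=list)
    children: Dict[str, "Node"] = field(default_factory=dict)

    def child(self, name: str) -> "Node":
        """Return the named child, creating it on first use."""
        if name not in self.children:
            self.children[name] = Node(name)
        return self.children[name]

    def attach(self, path: List[str], record: dict) -> None:
        """Store a record on the node reached by following path."""
        node = self
        for part in path:
            node = node.child(part)
        node.records.append(record)

    def walk(self) -> Iterator[dict]:
        """Yield every record at or below this node."""
        yield from self.records
        for c in self.children.values():
            yield from c.walk()


# Hypothetical data from three different source systems.
plant = Node("plant")
plant.attach(["unit-1", "pump-101"], {"source": "historian", "value": 4.2})
plant.attach(["unit-1", "pump-101"], {"source": "cmms", "work_order": "open"})
plant.attach(["unit-1", "valve-7"], {"source": "inspection", "result": "pass"})

unit_records = list(plant.child("unit-1").walk())
```

Because each record hangs off an asset in the tree, asking for everything about a unit or a single pump becomes one traversal. Nobody has to compare exports from separate systems by hand.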

Having all plant data in one system enables organizations to use modern IT tools, even if the systems supplying them aren’t nearly as modernized. It also unlocks new strategies and solutions to improve performance across the enterprise, and to empower personnel to act on the data.

Today’s data lakes standardize data to make it effectively system agnostic. Standardized data can then be sent to nearly any application, or it can be automatically connected to the system’s built-in analytics tools for seamless contextualization. Personnel across the enterprise can manage data and establish and drive key performance indicators and other metrics—which are automatically sent to users’ preferred device (mobile, tablet, desktop)—all from one system.
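As a rough sketch of what system-agnostic standardization involves, the example below maps records from two imagined sources into one common shape. Everything here is an assumption for illustration: the `StandardRecord` fields, the two source layouts, the adapter names, and the choice of bar as the common pressure unit.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict

PSI_TO_BAR = 0.0689476  # 1 psi expressed in bar


@dataclass(frozen=True)
class StandardRecord:
    asset_id: str
    measurement: str
    value: float
    unit: str
    timestamp: datetime  # always timezone-aware UTC


def from_historian(row: Dict[str, Any]) -> StandardRecord:
    # Assumed layout: "ASSET.MEASUREMENT" tag names, epoch seconds, value in bar.
    asset, measurement = row["tag"].split(".", 1)
    return StandardRecord(
        asset_id=asset.lower(),
        measurement=measurement.lower(),
        value=float(row["val"]),
        unit="bar",
        timestamp=datetime.fromtimestamp(row["ts"], tz=timezone.utc),
    )


def from_inspection_app(row: Dict[str, Any]) -> StandardRecord:
    # Assumed layout: ISO 8601 timestamps with offset, pressure in psi.
    return StandardRecord(
        asset_id=row["asset"].lower(),
        measurement=row["what"].lower(),
        value=round(float(row["psi"]) * PSI_TO_BAR, 3),
        unit="bar",
        timestamp=datetime.fromisoformat(row["when"]).astimezone(timezone.utc),
    )


# One adapter per source system; downstream tools only see StandardRecord.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], StandardRecord]] = {
    "historian": from_historian,
    "inspection_app": from_inspection_app,
}


def standardize(source: str, row: Dict[str, Any]) -> StandardRecord:
    return ADAPTERS[source](row)
```

Once every source passes through an adapter like this, analytics and reporting tools can be written against a single record shape regardless of where the data originated.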

Standardization in a single system is particularly useful when organizations have a wide array of instrumentation and valves that need to operate at peak performance to reduce downtime and emissions, ensure safety and drive optimum production. These devices are often widespread and hard to interconnect, and plant personnel do not have a way to cross-correlate all their data.

Advanced data lakes enable users from many different functional areas to run and view reports on cross-enterprise data from wherever they are. From their desk or from the field, users can track and trend performance to confirm proper configuration, evaluate performance based on equipment manufacturer or operating conditions, and more.

Improved efficiency and visibility across the enterprise


Bridging the gap between OT and IT is key to maintaining the flexibility necessary for a competitive advantage in today’s global market. Data lakes improve collaboration and decision making without requiring plants to rip and replace legacy equipment or manage complex infrastructure. The resulting connectivity improvement supports digital transformation initiatives and empowers IT and OT teams to work together to break down silos and put actionable advice and metrics in the hands of users on the plant floor and across the enterprise.

This story originally appeared in the May 2021 issue of Plant Services.

About the Author: Vineesh Kapoor
