If I have OT detection, why do I need incident response?


Danielle Jablanski and Dr. Emma Stewart

Detection and incident response have far-reaching implications in critical and interdependent sectors. An attack on transportation may impact everything from fuel delivery to water purification to the manufacturing supply chain. In the operational technology (OT) space, detection and incident response look very different today than they do in IT. Innate concerns and potential process failures are the exact precursors that allow threat actors to execute attacks in OT environments.

Significant attacks have shown that the weakest link in an organization may be a compromised OT system, frivolous internet connectivity, insecure remote access, or an IT system critical for business operations. Cyber incidents that impact OT are inherently less repeatable than IT attacks, and cyber threat intelligence on the threat actors and tactics, techniques, and procedures (TTPs) that specifically target industrial control systems (ICS) is less widely available because fewer incidents have been documented and analyzed.

Instead, common concerns frequently include things like unpatchable systems, delayed system updates, firewall misconfigurations, and negligent insider threat scenarios where change management, documentation, and other integration and implementation processes have deteriorated or failed. These common concerns often create cyber events worth remediating before threat actors have the opportunity to exploit them in a major incident.

Detection and action can prevent an intrusion from progressing into successful threat actor objectives. That is what makes the difference in the headline below – an attack thwarted vs. an incident suffered. Understanding prevention metrics for reliability and performance of OT cybersecurity solutions provides customers with the context for taking suggested action – “saving the organization from significant operational interruption.”

*Detection bulletin example from Intelligent Buildings*

What’s Top of Mind

Owners and operators are expected to make timely, objective, cost-effective purchasing decisions amid a sea of available, slightly different security products. All the while, they are asking themselves what more could be done between prevention and firefighting. Are you forced to choose between integrated response solutions and incident response capabilities?

Threat modeling is routinely used to assess attack vectors, tools, and malware in the context of your organization. While it’s important to pay attention to what adversaries are capable of, assessing mission-critical status, defining security objectives, collecting evidence, and performing analysis based on common cyber concerns is arguably just as important, if not more so.

Response is different in OT because the potential downtime can be considered unacceptable. There may be a future redesign, but in the meantime, owners and operators will settle for an operational system with a vulnerability rather than no operational system at all. What is needed is OT incident response that is context-specific and provides metrics for reducing the severity of direct impacts, with a focus on root cause analysis and anomaly detection for communications and process control variables.

According to Omdia, integrated response capabilities are “the response capabilities that the solution offers, whether that’s a proprietary technology such as micro-segmentation, or options to facilitate and orchestrate response with the customer’s existing technology.” In network security we tend to think of integrated response as the capacity to immediately identify and eliminate a threat, though this process is less automated within OT/ICS systems.

What’s Next

In many older control networks there are legacy systems that can no longer be serviced or replaced. This is an irrefutable truth in generation plants, distribution, and many non-energy sectors. Each has some system that will not be fixed or replaced if it is not operationally broken. If this component is considered critical, replacing it carries the cost of production downtime and the risk of unintended consequences to the system.

If you’re responsible for OT security it may feel like your head is on a swivel, reacting to the latest standards and regulatory requirements, tracking nation-state capabilities, and continuously running risk management programs with competing priorities in mind. Risk ownership and visibility gaps abound with mounting third-party and supply chain considerations.

Continuity of uptime is the enemy of prioritized event management for OT security. Some owners and operators are perceived as “not doing enough” cybersecurity because they are often forced into recovery frameworks concerned narrowly with system restoration and backups, at the expense of potential early detection, containment, and eradication opportunities.

In OT, response prioritization is dependent on detection. You can’t defend what you can’t see, but you also can’t do damage control if you don’t know the impacts of potential incidents as well as prevention or avoidance measures. In industrial control systems, in particular for energy delivery, there is a significant need to define more relevant risk calculation metrics for detection that reduces the severity of potential incidents.

Does OT/ICS require its own ‘integrated response’? 

Even when they target intermediary Windows and Linux systems between corporate and control networks, many OT incidents potentially have more in common with emergency management events, such as wildfire contagion, than with enterprise risk management. Metrics that serve to reduce the severity of physical impacts should be a key component of integrated response capabilities in OT.

Electricity delivery systems have long been held to reliability and performance standards, including power quality and system and customer outage duration. Similar context for recalculating risk and mitigating cyber events, with or without operational disruption, is less normalized in available cybersecurity solutions. OT cybersecurity solutions should include real-world impact analysis rather than focusing on CVSS scoring or context-latent risks.
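To illustrate the kind of reliability metric referred to here, outage duration is commonly summarized with indices such as SAIDI and CAIDI. The sketch below assumes the usual IEEE 1366 style definitions; the `Outage` record, the function names, and the sample numbers are hypothetical and are not drawn from any utility's data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Outage:
    """One sustained interruption (hypothetical record layout)."""
    customers_interrupted: int
    duration_min: float


def customer_minutes(outages: List[Outage]) -> float:
    """Total customer-minutes of interruption across all outages."""
    return sum(o.customers_interrupted * o.duration_min for o in outages)


def saidi(outages: List[Outage], customers_served: int) -> float:
    """Average interruption minutes per customer served."""
    return customer_minutes(outages) / customers_served


def caidi(outages: List[Outage]) -> float:
    """Average interruption minutes per customer actually interrupted."""
    interrupted = sum(o.customers_interrupted for o in outages)
    return customer_minutes(outages) / interrupted


# Illustrative numbers only.
outages = [Outage(500, 120), Outage(2000, 30)]
saidi_min = saidi(outages, customers_served=10_000)
caidi_min = caidi(outages)
print(f"SAIDI: {saidi_min:.1f} min, CAIDI: {caidi_min:.1f} min")
```

A comparable, impact-oriented number for cyber events would let an operator report what a detection avoided in the same terms already used for reliability.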

In California, utilities are authorized to turn off power to certain high-risk locations during weather events through a scheme called Public Safety Power Shutoff. A key challenge for the scheme was presenting metrics to the public that essentially prove a future state that was avoided by the shutoff action. Once those metrics were presented, customers began to accept the potential of a limited shutdown as the preferred alternative to a widespread outage.

Cyber incidents that go undetected can result in physical changes to the process. Often the first indication of a failing distribution transformer is its negative impact on power quality at best, and its destruction at worst. Some detection tools are beginning to pair communications and process anomalies in detection methodologies to determine whether intrusion and manipulation or accidental corruption has occurred.
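As a rough illustration of pairing the two kinds of anomaly, the sketch below matches communications and process anomalies that occur on the same asset within a short time window. The `Anomaly` record, the asset names, and the 300-second window are all assumptions made for this example; they do not describe any particular product's detection logic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Anomaly:
    kind: str      # "communications" or "process"
    asset: str     # hypothetical asset identifier
    time_s: float  # seconds since start of observation
    detail: str


def pair_anomalies(anomalies: List[Anomaly],
                   window_s: float = 300.0) -> List[Tuple[Anomaly, Anomaly]]:
    """Return (communications, process) pairs on the same asset
    whose timestamps fall within window_s of each other."""
    comms = [a for a in anomalies if a.kind == "communications"]
    process = [a for a in anomalies if a.kind == "process"]
    return [(c, p) for c in comms for p in process
            if c.asset == p.asset and abs(c.time_s - p.time_s) <= window_s]


# Illustrative events only.
events = [
    Anomaly("communications", "feeder-12-rtu", 100.0, "new workstation session"),
    Anomaly("process", "feeder-12-rtu", 220.0, "setpoint changed off schedule"),
    Anomaly("process", "pump-3-plc", 900.0, "flow reading drifting"),
]

pairs = pair_anomalies(events)
unpaired_process = [a for a in events
                    if a.kind == "process" and all(a is not p for _, p in pairs)]

for c, p in pairs:
    print(f"{p.asset}: '{p.detail}' follows '{c.detail}'")
for a in unpaired_process:
    print(f"{a.asset}: '{a.detail}' has no matching communications anomaly")
```

A paired result is a prompt to investigate intrusion or manipulation, while an unpaired process anomaly may point toward device failure or accidental corruption. Neither outcome is conclusive on its own.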

Focusing still on electricity, load variation caused by a customer-side IoT attack could lead to a physical disruption where voltage variation or oscillation would be investigated as a power quality event or a traditional device failure. OT cyber monitoring at generation and distribution sites likely won’t detect initial manipulation of customer-facing IoT devices, but process anomaly detection married to traditional cybersecurity behavioral analysis would alert early on the physical changes.

*Process anomaly detection example from Nozomi Networks OT Solution*

Another example of a process anomaly is flow as a variable in a process. If the transmission interval for the variable changes, a rule can raise an alert when the interval deviates from normal by more than a set threshold, 50 milliseconds for instance. Early detection of suspicious process activity should be an indicator for investigation of incidents, allowing for specific OT integrated response before the situation causes physical impacts to the power supply.
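The interval rule described above can be sketched in a few lines. This is a minimal illustration that assumes timestamps in milliseconds and a known expected interval; the function name, the `IntervalAlert` record, and the sample timestamps are hypothetical, and a real monitoring tool would learn the baseline rather than take it as a parameter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntervalAlert:
    index: int          # position of the late or early transmission
    interval_ms: int    # observed gap since the previous transmission
    deviation_ms: int   # absolute difference from the expected gap


def find_interval_anomalies(timestamps_ms: List[int],
                            expected_interval_ms: int,
                            tolerance_ms: int = 50) -> List[IntervalAlert]:
    """Flag transmissions whose gap from the previous one deviates
    from the expected interval by more than tolerance_ms."""
    alerts = []
    for i in range(1, len(timestamps_ms)):
        interval = timestamps_ms[i] - timestamps_ms[i - 1]
        deviation = abs(interval - expected_interval_ms)
        if deviation > tolerance_ms:
            alerts.append(IntervalAlert(i, interval, deviation))
    return alerts


# Illustrative flow-variable timestamps, nominally one update per second.
flow_timestamps_ms = [0, 1000, 2000, 3010, 4200, 5200]
alerts = find_interval_anomalies(flow_timestamps_ms, expected_interval_ms=1000)

for alert in alerts:
    print(f"update {alert.index}: interval {alert.interval_ms} ms "
          f"deviates by {alert.deviation_ms} ms")
```

In this sample the 10-millisecond jitter on the fourth update stays inside the tolerance, while the 190-millisecond delay on the fifth update is flagged for investigation.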

Detection that couples traditional communications monitoring with process variables gives responders a dual understanding and key context for prevention and containment. Effective detection and response have bidirectional impact; communications and process variables should not be considered in silos for OT cybersecurity. In this way, integrated OT response can more readily treat both the causes and the symptoms of cyber incidents.

About the Authors

Dr. Emma Stewart

Dr. Emma Stewart is a subject matter expert in electric grid resilience and cybersecurity, currently consulting independently. Prior to this role she was the chief scientist at NRECA and led initiatives on cybersecurity, resilience, and federally funded R&D for over 900 electric co-ops. Emma has over 20 years of experience in power systems, resilience, and sensing, including as the Associate Program Leader for Defense Infrastructure at Lawrence Livermore National Laboratory, where she led a research team in providing assured power and restoration solutions for critical facilities and advanced sensing and measurement for resilience and cyber applications. Emma also led teams at LBNL and DNV GL in distribution grid analysis and analytics. She received her PhD in Electrical Engineering in 2009 and her Master's in Electrical and Mechanical Engineering in 2005, both from the University of Strathclyde in Scotland. Emma developed the first distribution-synchrophasor network within the US grid, with the data currently in use in many R&D activities throughout the national lab and industrial network.


Danielle Jablanski

Danielle Jablanski is an OT Cybersecurity Strategist at Nozomi Networks, responsible for researching global cybersecurity topics and promoting OT and ICS cybersecurity awareness throughout the industry. She leads several Nozomi Networks engagements with the OT Cybersecurity Coalition, CISA’s Joint Cyber Defense Collaborative for industrial control systems, the ISA Global Cyber Alliance, the Electricity Information Sharing and Analysis Center, and more. Jablanski also serves as a non-resident research fellow at the Atlantic Council Cyber Statecraft Initiative, focusing on operational technology policy, prioritization, and workforce development issues.

Jablanski serves as a staff and advisory board member of the nonprofit organization Building Cyber Security, leading cyber-physical standards development, education, and certifications to advance the physical security, safety, and privacy in public and private sectors. Danielle previously served as the President of the North Texas Section of the International Society of Automation (ISA). She has previously worked as an ICS intrusion detection market analyst, and a cyber program manager for the Cyber Policy Center at Stanford University. She holds a master’s degree in international security from the University of Denver, and a bachelor’s degree in political science from the University of Missouri-Columbia.
