Sunday, March 14, 2021

Book memo: Measuring and Managing Information Risk

Measuring and Managing Information Risk, by Jack Jones and Jack Freund. Published by Butterworth-Heinemann, 2014.

This book is at the foundation of risk management as taught by the FAIR Institute.  FAIR is "Factor Analysis of Information Risk" and is a framework for evaluating cybersecurity risk.  Open FAIR standards are managed by the Open Group.  FAIR has been mapped into the NIST cybersecurity framework as well.

The recent breach of the Oldsmar, Florida water treatment plant ("Compromise of U.S. Water Treatment Facility") is discussed in a FAIR Institute blog post that tries to take a "realistic look at the risk, applying the discipline of FAIR thinking".  However, you may need to read the outline below to follow that analysis.

It is not possible to condense a book into a few paragraphs in a blog post, but in outline, this is the FAIR ontology model.  The risk analyst gets numbers from subject matter experts as input into the model.  As you can imagine, getting these with any precision is hard.  Each input in the model is characterized by a probability distribution derived from three parameters: estimates of the minimum, the maximum, and the most likely value.  The analyst then runs a Monte Carlo simulation to get estimates of the risk: the minimum, the maximum, the most likely value, and the average annual risk.
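The mechanics can be sketched in a few lines of Python.  This is not the FAIR standard calculation, just an illustration of the idea: each input is drawn from a distribution fit to (minimum, most likely, maximum) estimates, with Python's triangular distribution standing in for the PERT-style distributions FAIR tools typically use, and repeated sampling yields a distribution of annualized loss.  All of the input numbers here are invented.

```python
import random

random.seed(42)  # reproducible illustration

def sample(low, mode, high):
    """Draw one value from a triangular distribution fit to
    (minimum, most likely, maximum) estimates."""
    return random.triangular(low, high, mode)

def simulate_annual_loss(trials=100_000):
    """Toy FAIR-style Monte Carlo: annualized loss = frequency x magnitude."""
    losses = []
    for _ in range(trials):
        # Loss Event Frequency: events per year (min, most likely, max)
        lef = sample(0.1, 0.5, 2.0)
        # Loss Magnitude per event, in dollars (min, most likely, max)
        magnitude = sample(10_000, 50_000, 500_000)
        losses.append(lef * magnitude)
    losses.sort()
    return {
        "min": losses[0],
        "median": losses[len(losses) // 2],  # proxy for "most likely"
        "max": losses[-1],
        "average": sum(losses) / len(losses),
    }

result = simulate_annual_loss()
```

The output summarizes the simulated annualized-loss distribution; real FAIR tooling reports similar percentile-based summaries rather than a single point estimate.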

The analyst tries to avoid going deeper down the tree than necessary.  For example, for an insider threat from privileged insiders, Vulnerability can be assumed to be 100%, so there is no need to consider Threat Capability and Difficulty.

With that under your belt, the FAIR blog post can be summarized as saying that the water treatment facility was (apparently) in a state of high Vulnerability, but actually had high Resistance Strength.  The Threat Capability of a probable attacker would be low, and the Probability of Action low.  In other words, at least so far, the frequency of attacks on water treatment plants is low.  The analyst also thinks that the significant cost is likely reputation damage (one of the several categories of loss in the FAIR model, not depicted on the diagram).

The main purpose of trying to be quantitative about risk is to be able to prioritize the myriad things that management needs to address and fund; per the analyst, pipes freezing, electric grid outages, and "many more events" are the bigger threats for waterworks.

I think the FAIR blog post has several weak points, which illustrate the complexities the analyst has to deal with.
  • The poor security practices (no firewall, the same password on all computers, end-of-life software, etc.) were not really mitigated by compensating controls.  From the news reports, it appears that it was happenstance that a remote supervisor saw the chemical levels being tampered with.  The experts are quoted as saying "they got lucky" and that this was the "equivalent of walking through an unlocked front door."  There is no mention of additional continuous monitoring, alarms, or testing to detect high levels of sodium hydroxide in the water.  Luck is not a mitigating control.

    Vulnerability, in FAIR terminology, does not mean "a weakness that can be exploited".  It means "the probability a threat agent's action will result in loss".  From the publicly available accounts, this probability was high.

  • It is not clear that one can deduce the Threat Event Frequency from past attacks.  It all depends on how much you buy the argument that waterworks have been available as a target for a long time, so the fact of no/few attempted attacks is a good indicator of low Threat Event Frequency.  Every category of cyberattack has had a first instance, and in some cases, e.g., ransomware, the subsequent growth in attacks has been spectacular.

  • The analyst in the blog post is analyzing risk from the perspective of the management of the waterworks.  For example, management actually would have a defense against claims of negligence by regulators or litigious stakeholders: they had recently completed a risk and resilience review for the EPA and have until the end of 2021 to implement any findings.

    The FAIR methodology says that if your primary stakeholder is different, you need to conduct a second analysis.  If we take the perspective of the water-consuming public, I don't think the management having legal defenses against claims would be of any comfort to anyone hurt by high sodium hydroxide levels in the water.  (The news reports say that death or serious illness was unlikely; more likely were skin irritations and rashes.)

FAIR likely provides a good framework within which to reason about and even quantify risk.  I'm just learning about all this, so this is a mildly informed opinion only.

About black swans: I think the methodology can accommodate rare, large loss events by setting the maximum magnitude of loss appropriately.  Of course, there is the question of whether the Monte Carlo simulation will catch the rare events in the long-tailed probability distribution; one may need to run the simulation long enough to catch a few maximum-magnitude events.
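As a toy illustration of that sampling concern (the numbers are invented, not from the book): if the extreme-tail loss occurs with probability p per trial, a run of n trials sees it about n*p times on average, so a short run will usually never sample it at all.

```python
import random

random.seed(0)  # reproducible illustration

# Assumed chance that a single trial lands in the extreme tail
# (a maximum-magnitude loss event); invented for illustration.
P_EXTREME = 1e-4

def count_extreme_events(trials):
    """Count how many simulated trials hit the extreme-tail loss."""
    return sum(1 for _ in range(trials) if random.random() < P_EXTREME)

few = count_extreme_events(1_000)       # expected ~0.1 hits: tail likely missed
many = count_extreme_events(1_000_000)  # expected ~100 hits: tail represented
```

With only a thousand trials the expected number of tail hits is a fraction of one, so the rare event typically never appears in the output; scaling up the trial count is what makes the long tail visible in the simulated loss distribution.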

About grey rhinos (highly probable, high-impact, yet neglected threats): the attack on waterworks might turn out to be one such.  The FAIR methodology, if properly used, should make grey rhinos pop out immediately.