This series of posts will explore my perspective on risk as a metric and a measure in software testing. First up: The Math of Risk.
There’s a common misconception when it comes to risk that it cannot be used, measured, or understood, and that if it can be measured, the measurement lacks credibility or clear definition. I have a fundamental disagreement with this assumption, as I feel risk is foundational to testing and software development in general. Every decision we make is rooted in the notion that we want to mitigate the potential impact and likelihood of failure. In other words, every decision we make is rooted in identifying what the risk is and how we mitigate it.
Turning risk into a metric requires that we accept that this is a somewhat subjective or qualitative measure. Despite often demanding that metrics be “hard numbers” rooted in quantitative calculations, we’ve come to accept this kind of measure in other metrics such as pointing. Risk measurement has much in common with pointing, by the way. Both are a relative size measurement based on the intuition of the experts, or put another way, the team.
Numbers Have Meaning
There is an endless list of potential scales that can be used for measuring risk. You can use emojis, colors, pictures of otters showing increasing levels of panic; it goes on and on. But the only way you can truly turn risk into a metric is to use numbers. I have said in the past, and I’m sure will continue to say in the future: while there are plenty of scales that can be used, there’s only one that truly has context for anyone who looks at it.
Numbers have a weight and definition. There are no assumptions that need to be made. There’s no ramp up to team personality and culture required to understand them. Numbers represent magnitude, so they can be used to compare risks and prioritize risk mitigation. The other benefit we get from metrics besides measurement is the ability to track change over time. And while I think otter pictures are great, I can only measure trends through numbers.
So there you have it: if we want to use risk as a metric, our scale must be numbers.
While the notion of using numbers is not new, the way I use them is slightly different. Full disclosure—this way of calculating risk I advocate is not my own creation. I learned this while working at Progressive Insurance; this measurement is their standard way of doing business. (I bet you thought insurance couldn’t be innovative, right?) And while I didn’t create this formula or scale, I did help to usher in some changes to the way risk is managed and measured over time. Keep an eye out for a future post where I dig into that further.
The Math of Risk
ISTQB and others advocate for two components that are used together to give the measurement of a specific or identified risk. Their two components are likelihood or probability and impact.
Likelihood or Probability
Plainly stated, the likelihood or probability of a risk combines how often there has been a defect or failure related to this particular risk with our estimate of how likely a defect or failure related to this risk is going forward.
Impact
The impact related to a particular risk is what we view the impact of a defect or failure related to this risk to be. In my opinion, impact is something we don’t dig deep enough into. Sure, we need to know the user impact if the software fails, and we should consider any technology-related impacts (loss of data, etc.) if there’s a defect or failure, but we really should think deeper and broader when we consider impact.
For many of us, there can be financial, regulatory, or legal impacts of a failure. We should also consider the potential impact of a failure on the business or the brand identity. Is this a module of our software that, if it fails, we’re likely to see people head to Twitter to complain? Are we potentially going to find ourselves in the crosshairs of a New York Times exposé if this feature fails? If so, we need to consider this in our rating for impact.
Traditional Method of Calculating Risk
The traditional method of risk calculation uses a 1-3 scale for Likelihood/Probability and a 1-3 scale for Impact, with 3 being the highest and 1 being the lowest. These two components are then multiplied, and there you go: your risk score for that particular risk is ready for you to weigh against others. For example, 3 likelihood and 2 impact would be a risk score of 6 out of a possible 9.
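For anyone who prefers to see the arithmetic as code, here is a minimal Python sketch of the traditional calculation. The function name `traditional_risk_score` is hypothetical and only here for illustration; the 1-3 range check simply mirrors the scale described above.

```python
def traditional_risk_score(likelihood: int, impact: int) -> int:
    """Multiply likelihood by impact, each rated 1 (lowest) to 3 (highest)."""
    # Reject ratings that fall outside the traditional 1-3 scale.
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {value}")
    return likelihood * impact


# The example from the text: 3 likelihood and 2 impact.
print(traditional_risk_score(likelihood=3, impact=2))  # 6
```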
A Better Way to Calculate Risk
I advocate for adding an additional component and broadening the scale for risk.
I find that likelihood/probability and impact are too limiting when considering risk. When we’re building something new, there’s a good chance we don’t fully know what the probability of a failure is yet. It’s possible that we’ve seen a high level of stability in a feature, but despite it being reliable and stable thus far, it’s still complicated and potentially difficult to maintain.
Therefore, I advocate for adding in the additional component of complexity. Complexity allows us to consider the whole of a specific risk, instead of just the factors of impact and likelihood.
Broadening the Scale
When using the 3-1 scale ISTQB and others commonly advocate for, I find myself not having enough room on the scale to fully measure the relative size of individual risks. When using this 3-1 scale, our potential risk scores are 9, 6, 4, 3, 2, 1.
While there are plenty of potential scores on the low end, there are very few on the high end. This can leave us in a situation where we don’t have enough context to make meaningful decisions based on the risk scores we’re seeing. Put another way, I can’t rely on this the way I want to, to make informed decisions around where to focus my test efforts.
Instead, I suggest using a 5-1 scale for all three components of likelihood, complexity, and impact. But I don’t want my scale to become so broad that I lose context because there’s too much gap between risk scores. So, I instead use a simple evaluation and calculation to make my scale manageable.
Let’s consider a risk called shopping cart for this example. We’ve rated the components of our shopping cart below.
Likelihood – 2
Complexity – 3
Impact – 5
To keep our scale manageable, we want to hold to a 25-1 scale. To do this, we’re going to multiply Impact by Likelihood or Complexity, whichever of the two is higher. Since Complexity is a 3 and Likelihood a 2, we’ll multiply Complexity by Impact. This is going to give us a risk score of 15.
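The shopping cart calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function name `risk_score` and its argument names are hypothetical, and the 1-5 range check just reflects the scale described here.

```python
def risk_score(likelihood: int, complexity: int, impact: int) -> int:
    """Impact times the higher of likelihood and complexity, each rated 1 to 5."""
    ratings = {"likelihood": likelihood, "complexity": complexity, "impact": impact}
    # Reject ratings that fall outside the 1-5 scale.
    for name, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    # Only the larger of likelihood and complexity is used, which caps the score at 25.
    return impact * max(likelihood, complexity)


# The shopping cart example: Likelihood 2, Complexity 3, Impact 5.
print(risk_score(likelihood=2, complexity=3, impact=5))  # 15
```

Taking the higher of the two ratings, rather than multiplying all three, is what keeps the top of the scale at 25 instead of 125.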
The benefit of this method and larger scale is that my new scale is 25, 20, 16, 15, 12, etc. This larger scale enables me to make more informed decisions about things like test coverage based on risk.
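To see how much room the wider scale buys, the short Python sketch below lists every score that two ratings can produce when multiplied. The helper name `possible_scores` is hypothetical; it only enumerates products and says nothing about how often each score shows up in practice.

```python
from itertools import product


def possible_scores(top):
    """Distinct products of two ratings from 1 to top, highest first."""
    ratings = range(1, top + 1)
    # A set removes duplicate products such as 2 * 3 and 3 * 2.
    return sorted({a * b for a, b in product(ratings, repeat=2)}, reverse=True)


print(possible_scores(3))  # [9, 6, 4, 3, 2, 1]
print(possible_scores(5))  # [25, 20, 16, 15, 12, 10, 9, 8, 6, 5, 4, 3, 2, 1]
```

That is 6 distinct scores on the 3-1 scale versus 14 on the 25-1 scale.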
If you’re saying to yourself “Great, I know the formula, what do I do now?”—don’t worry, I’m not leaving you in the dark. There’s more to come! In this series I plan to break down the nuts and bolts of each step of risk analysis with my perspective on each. If there’s something specific you’d like to learn related to risk, let me know on Twitter (@TheyWrestleTest), and I’ll try and include it in the series.