friso.lol

I'm being serious here.

Use Data the Hard Way

TL;DR

  • You should not use data without being absolutely sure about the correct interpretation of said data.
  • Probability tells you how likely it is that a bad thing will happen given a large enough sample (how often it will happen given enough time).
  • Maximum downside potential is the worst damage that the bad thing can do.
  • Risk is the probability multiplied by the average downside potential. Risk is an average.
  • You make decisions based on risk. You have insurance against maximum downside potential. Without the latter, doing the former will at some point mean game over.
  • People will more readily accept false conclusions when they are visualised in a dashboard.

Prelude

"Amazon reported a 30% increase in revenue in the first quarter they were using product recommendations, so we need to build a recommendation engine," said every e-commerce executive in 2011.

I know this because at the time I was in the business of selling custom recommender systems and we used this line in our sales pitch. In reality, besides the delusion that replicating Amazon's growth is easily achieved by deploying a single piece of software, we had absolutely no realistic way of fully understanding the composition of Amazon's revenue, let alone how exactly it was influenced by their product recommendation systems. We just highlighted a numerical factoid to create an illusion of evidence-based or even data-driven decision making.

This happens a lot. In a variety of ways. Here are three common mistakes that people consistently and deliberately make to appear well informed or even data-driven.

Using Numbers Without Understanding Their Correct Interpretation

The weather forecast for tomorrow shows a 50% chance of rain in The Netherlands. What does that mean?

  1. At any location in the country, there is a 50% chance that it will rain tomorrow.
  2. It will rain in 50% of the locations in the country tomorrow; pure chance determines which areas will and which will not experience rain.
  3. There is a 50% chance that it will rain somewhere in the country tomorrow.

Hardly anyone you ask will know the correct answer, assuming it is even there (I will leave this as an exercise for the reader). Yet this one number influences the majority of people's weekend plans, including the plans of those who are shocked when it rains in spite of the weather report forecasting only a 20% chance of this happening.

Back in 2011 many were shocked that their revenue did not skyrocket like Amazon's in spite of implementing product recommendations.

Confusing Probability, Risk, and Downside Potential

When someone tells you: "the odds of getting hit by a bus are smaller than the odds of winning the lottery", do you stop looking out for buses when you cross the street? Probably not, if you care about staying alive.

Now assume you run a small business that employs a team of five highly educated, tech-savvy professionals. The odds that any one of them will interact with a phishing attempt are only 1 in 100000. So there is roughly a 0.005% chance that someone in your business will click the wrong kind of link or open the wrong kind of attachment. Do you worry about protecting your business against ransomware?

For every 20000 businesses that did not consider this a priority, there is 1 that has learned it should have considered otherwise. Out of sheer politeness, we tend to call those instances unlucky instead of ignorant.
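For those who want to check the arithmetic, here is a minimal sketch in Python. It assumes the five employees act independently of each other, which the numbers above imply but which is my simplification; the function name is made up for illustration.

```python
def prob_at_least_one(p_individual: float, n_people: int) -> float:
    """Chance that at least one of n_people triggers the event.

    Assumption (not from the article): people act independently.
    """
    # Chance that nobody triggers it is (1 - p) ** n; take the complement.
    return 1.0 - (1.0 - p_individual) ** n_people


# Figures from the text: five employees, each with a 1 in 100000 chance.
p_team = prob_at_least_one(1 / 100_000, 5)

print(f"chance per business: {p_team:.4%}")
print(f"roughly 1 in {round(1 / p_team)} businesses")
```

The exact result is a hair under 5 in 100000, because two people clicking at once is counted only once. At these magnitudes the difference is invisible, which is why simply multiplying by five gets you to the same place.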

The Dashboard Availability Heuristic

Here is a fun fact: software teams that do not track story points have fewer productivity issues than teams that do. Indeed, the availability of a plateauing burn down chart invariably leads to identification of possibly non-existent productivity issues. In extreme cases businesses will even hire for a dedicated role with the sole responsibility of changing the development process into one that produces exclusively linearly declining burn down charts (as they should be), all without a proper assessment of whether that is a high-priority initiative.

The Availability Heuristic is what makes us jump to conclusions when examples are immediately top of mind. For example, the existence of spectacular news stories can lead us to believe that shark attacks are a big problem in Australia (in reality, a lot more people drown without the assistance of a marine predator).

The Dashboard Availability Heuristic is a special case of this phenomenon where the progression of some metric is top of mind due to the ubiquitous availability of a dashboard that tracks it. This leads to giving more weight to initiatives that directly influence the dashboard, forgoing others that might actually be more important.

More software teams are aware of their burn down rate than of their commercial contributions, in much the same way that more people are aware of the potential presence of sharks in some waters than of the strong currents that are in fact more dangerous.

The Hard Way?

Then how do we use data (or dashboards) at our disposal? Well, the hard way of doing this is not necessarily a lot more difficult, but it is definitely more work. Here are the official friso.lol rules of engagement.

On the Interpretation of Numbers

When you are unsure what the correct interpretation of a number or metric is, you cannot use it for decision making until you have formed a complete picture of the methodology of both collecting and reporting that number.

The correct interpretation of collected data is often vital. This is why sailors, airline pilots, and farmers use different weather forecasts. That way they understand the interpretation of the numbers. Yet, the majority of online businesses will not be able to describe their internal definition of a unique website visitor, including the methodology of collection and reporting.

For each number that you actively use in reporting, investigate the correct interpretation.

On Probability, Risk, and Downside Potential

Probability tells you how often something bad will happen given a large enough sample. How large that sample is or needs to be depends on a lot of things.

Downside potential is the worst possible impact when the bad thing happens. In investment, this is called maximum draw down. I say "when", because invariably the bad thing will happen; though it might not happen to you.

Risk is the probability of the bad thing happening multiplied by the average downside potential of the bad thing across all cases. Risk applies to populations and processes. Not to individual decisions and outcomes.

You take risk when you make many decisions with some probability of a draw down. You can sustain each individual draw down and, across the combined decisions, you gain more than you lose. If you cannot sustain the individual draw down of a poor decision or you are only considering one sample, risk is not the factor you are looking for.

An example of this is when you programmatically buy online advertising and sometimes end up paying more than your margin on the conversion. That is a risk you could be willing to take.
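To make the definitions concrete, here is a small sketch of the advertising case. The margin and the ad costs are numbers I made up for illustration, and the function and field names are hypothetical; only the definitions (probability, average downside, risk as their product, maximum downside) come from the text above.

```python
# Hypothetical figures, not real data: the margin earned per conversion and
# the ad spend it took to win each of ten conversions.
MARGIN = 10.0
AD_COSTS = [4.0, 6.0, 3.0, 12.0, 5.0, 7.0, 15.0, 4.0, 6.0, 8.0]


def summarize(margin: float, costs: list[float]) -> dict[str, float]:
    """Describe a series of ad buys in terms of probability, risk and downside."""
    outcomes = [margin - cost for cost in costs]
    # A "bad thing" here is a conversion that cost more than its margin.
    losses = [-outcome for outcome in outcomes if outcome < 0]
    probability = len(losses) / len(outcomes)
    average_loss = sum(losses) / len(losses) if losses else 0.0
    return {
        "probability": probability,           # how often the bad thing happens
        "average_loss": average_loss,         # average downside when it does
        "risk": probability * average_loss,   # an average across all decisions
        "max_downside": max(losses, default=0.0),  # worst single case
        "net_result": sum(outcomes),          # what the combined decisions earned
    }


summary = summarize(MARGIN, AD_COSTS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these made-up numbers two out of ten buys lose money, yet the combined result is positive and no single loss comes anywhere near threatening the business. That is the situation in which deciding on risk makes sense.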

You consider downside potential or maximum draw down of any decision you make or any probability that you are exposed to, regardless of sample. In cases where you cannot sustain the maximum draw down (i.e., it would put you out of business), you look for ways to mitigate this. Probability of the draw down occurring is not a factor.

You protect yourself against ransomware in such a way that as a business you will survive it, regardless of how improbable this is. The probability only tells you how often it will happen, so you can use it to determine what kind of mitigation is justified.
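The order of those two checks can be written down as a rule. This is a rough sketch under my own assumptions: the function name, the parameters, the return labels, and all the amounts are hypothetical, and a real decision involves more than three branches.

```python
def decide(
    probability: float,
    average_loss: float,
    max_draw_down: float,
    sustainable_loss: float,
    expected_gain: float,
) -> str:
    """Return 'mitigate', 'accept' or 'decline' for a repeated decision."""
    # First check survival. Probability is deliberately NOT consulted here:
    # a draw down you cannot sustain must be mitigated however unlikely it is.
    if max_draw_down > sustainable_loss:
        return "mitigate"
    # Only when every single outcome is survivable does risk (an average)
    # become the right number to decide on.
    risk = probability * average_loss
    return "accept" if expected_gain > risk else "decline"


# Hypothetical ransomware case: tiny probability, fatal worst case.
print(decide(0.00005, 50_000.0, 2_000_000.0, 250_000.0, 0.0))

# Hypothetical ad-buying case: frequent small losses, all survivable.
print(decide(0.2, 3.5, 5.0, 1_000.0, 3.7))
```

The first call comes back as "mitigate" even though the probability is five in a hundred thousand, and the second as "accept" even though one in five buys loses money. The probability still matters in the first case, but only for choosing how much mitigation to pay for.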

On Dashboard Availability

You understand the causality between your metrics and your business outcomes. Either through evident cause-effect relationships or through inference.

That means that if you care about productivity and you happen to have a burn down chart on your hands, you first establish that quicker burn down actually yields higher productivity in your specific context. If that fails, there is no heuristic or no problem. Or neither.