
25 Sapient Principles for Better Data Literacy

Let’s define data literacy as the ability to analyze, understand, and communicate data. In case it hasn’t been hammered into our heads enough already: data literacy is a critical component of data-driven decision-making.

 

(Image source: xkcd)

The journey to transform your product team’s analytics is a long one. We’re finally at a place where the data can speak to us and we can make decisions based on what it tells us. That is precisely why literacy can create or destroy value in an organization: the cost of misinterpretation can be significant.

 

In this post, we’ll discuss how to improve your data literacy through:

  • principles
  • training and practice

As we go through the sections, I’ll share and dissect examples of data illiteracy that I’ve seen recently.

 

Becoming data fluent doesn’t happen overnight. It takes discipline, awareness, and collaboration to hone.

Data Literacy Principles

If you think about it, becoming data fluent is much like learning a new language. There’s a lot to learn — new vocabulary, colloquialisms, contexts, and media. Here are a few principles to get you into the right state of mind.

Building awareness

  • Know your sources. Not all sources are created equal. Understand each source’s intrinsic biases.
  • Know the limitations of your data. Sometimes you only have a small historical viewpoint. Or, the usage data doesn’t cover all use cases.
  • Be realistic about your confidence. Your data will never be perfect, so don’t expect it to be. Imperfection is ok — you just need to communicate it.
  • Make sure the data is relevant. There’s an overabundance of data. Don’t add to the noise. It can distract or even derail important decisions.

While interpreting data

  • Be aware of blind spots. Blind spots manifest in one of two forms: (1) we know what we don’t know, and (2) we don’t know what we don’t know.
  • Get other perspectives. Trust your instincts, but also be open to others’ interpretations of the data.
  • Explore alternative explanations. There are many ways to interpret readouts, so stretch your lateral thinking here. One of these explanations will be closest to the truth.
  • Keep it simple. Conventional wisdom holds that the simplest explanation is usually the best one. See Occam’s Razor and Kolmogorov complexity.

While communicating your insights

  • Tell a story. In my last post, I talk about the role of storytelling in compelling decision-making. It’s a good way to make your message stick.
  • Make it relevant. Same point as above, but it’s worth repeating.
  • Focus on insight, not just data points. It’s easy to regurgitate “facts” and tidbits. It’s much harder to distill data into lessons and actions.
  • Be transparent. One responsibility of being data fluent is intellectual honesty. Communicate the shortcomings, risks, and considerations.
  • Break it down. As you get more fluent, your underlying analyses can also get more complex. Make sure you can communicate the methodology.

Expectations of the organization

  • Invest in training. You’ve already invested in the data and the tools; follow through. Training starved of resources doesn’t scale well.
  • Reward data literacy. The optics matter if you want data literacy to be a capability. Make sure to recognize people who show fluency.
  • Align training to important goals. I picked this one up from a Towards Data Science post, “3 Musts for Building Data Literacy”.
  • It starts at the top. You need leadership to demonstrate both dedication and data literacy. It’s very disappointing when a leader gives blank stares and doesn’t understand what’s going on.

I’ll take “Fallacious” for $800, please. (That’s a Jeopardy! reference)

Imagine that you’re a senior Product executive. You read in a news headline that Microsoft Teams has more users than its competitor Slack (20 million vs 13 million). Your peer turns to you and says, “I think Slack is still winning.” Curious, you ask for her explanation.

 

She looks at you and says, “Their CEO said that they have over 50 clients with $1M ARR. Over 70% of those are [Microsoft] Office 365 clients.” Full stop.

 

Looking for clarification, you ask for her definition of winning. She says, “Well, they’re getting revenue from large Office 365 clients, so they have to be winning.” Full stop.

 

You reason, “That doesn’t mean they’re winning, especially when you consider how much larger Office 365’s revenue is compared to Slack’s. We don’t know anything else besides the active user counts for both of them.”

 

She confidently responds, “Well, their CEO said…” And then you tune out.

What happened there? Well, a few things.

  1. Pure regurgitation. “The CEO said this.”
  2. Fallacy or a lack of logic. “Slack is also being bought by Office 365 customers.”
  3. Lack of depth. No opinion when asked for a definition of “winning”.
  4. Lack of curiosity. No interest in exploring other perspectives or measurements.

 

Train and Practice for Data Literacy

As the cliche goes, “purposeful practice makes perfect”. That also applies to data literacy.

Four principles for literacy training

  1. The best training is contextually relevant. As you create a training curriculum, pick an important project. List out key questions you need to answer. The analytical tools will all be the same — statistics, algebra, regressions, decision trees, and so on. Applying them to a relevant concern will make the lessons stick.
  2. Start with the fundamentals. That said, there are baseline tools that everyone needs to be aware of. These include aggregations, distributions, basic stats, and probabilities. Qlik offers a good set of Data Literacy Courses, from basics to more advanced techniques. (A minimal code sketch of these fundamentals follows this list.)
  3. Be comfortable with the ambiguity. Calculating a number is easy. It’s much more challenging to know what to do with it. (“Okay, so now you know Olive Garden bakes 700 million breadsticks annually. Now what?”)
  4. Don’t get cocky. Some people will learn one thing and then act like they know everything. It’s very easy to be humbled in this line of work.
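To ground point 2, here’s a minimal sketch in Python (pandas) of those baseline tools. The dataset, column names, and numbers are all invented for illustration.

```python
# Toy illustration of the baseline tools: aggregations, distributions,
# basic stats, and probabilities. All data below is made up.
import pandas as pd

sessions = pd.DataFrame({
    "user":   ["a", "a", "b", "b", "b", "c", "d", "d"],
    "clicks": [3, 5, 0, 2, 4, 12, 1, 1],
})

# Aggregation: total clicks, and average clicks per user
print(sessions["clicks"].sum())                   # 28
print(sessions.groupby("user")["clicks"].mean())  # a=4.0, b=2.0, c=12.0, d=1.0

# Distribution: don't stop at the mean; look at the spread
print(sessions["clicks"].describe())              # count, mean, std, quartiles

# Probability: share of sessions with at least one click
print((sessions["clicks"] > 0).mean())            # 0.875
```

None of this is fancy, and that’s the point: fluency starts with being able to reach for these reflexively.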

Four principles for literacy practice

  1. Find projects to practice on. Find a relevant set of projects where you can apply the analytics tools the team has learned. Reason with data where possible. Challenge your assumptions and the work of others. That’s healthy.
  2. Find a mentor. This should be someone with more data fluency experience. These people may be in “junior” positions relative to you but may have much more exposure to data. Ask for their interpretation of the data and present your case. Build a sounding board of similar people for greater effect.
  3. Present it publicly. This is a tough one because it can be nerve-wracking. But it’s the best way to field-test your skillset. Your growth comes from answering unexpected questions and looking at things differently. You will also practice changing the perspectives of others.
  4. Evolve continuously. Incorporate different data sources. Blend data sources. As you get more comfortable with your existing domains, you can expand your scope. Find different ways to present and visualize your insights. Learn new techniques.

I’ll take “Erroneous” for $2000, please.

(In this example, all numbers are based on real occurrences. Names are fictional.)

 

Imagine that Jimmy is the PM for Product X. CEO Kim asks Jimmy if the usage of Product Y has any impact on the usage of Product X. Jimmy crunches some numbers and arrives at the chart below.

 

 

After seeing the 11.3% difference, Jimmy excitedly reports back to the CEO: “Kim, people who use both products are 11.3% more likely to click in Product X.” CEO Kim pats Jimmy on the back and tells the company about this wonderful discovery.

 

Let’s pause right here. We can count two prominent issues with Jimmy’s analysis.

  1. There’s not much rationale. Sure, the numbers show an 11.3% bump, but we don’t know what’s contributing to that increase. More on this later.
  2. That’s not a likelihood. It’s a ratio. It’s false to say that 11.3% is an increase in the probability that a user will click. It’s also false to say that there’s an 11.3% chance that they’ll click more. There’s 11.3% more activity in aggregate, not probabilistically. (The sketch after this list illustrates the difference.)
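Here’s a toy sketch of that distinction, with invented numbers (this is not Jimmy’s actual data): a sizable lift in average clicks can coexist with zero change in the likelihood that a user clicks at all.

```python
# A lift in average clicks is not a lift in the likelihood of clicking.
# All numbers are invented for illustration.
import pandas as pd

x_only  = pd.Series([0, 0, 10, 1, 1])   # clicks per user, Product X only
x_and_y = pd.Series([0, 0, 12, 1, 1])   # clicks per user, Products X and Y

# Ratio of aggregate averages -- the kind of figure Jimmy reported
lift = x_and_y.mean() / x_only.mean() - 1
print(f"aggregate lift: {lift:.1%}")    # 16.7%

# Probability that a given user clicks at all -- a different question
print((x_only > 0).mean(), (x_and_y > 0).mean())  # 0.6 vs 0.6: unchanged
```

In this toy case, a single heavy user produces the entire lift while every user’s chance of clicking stays the same, which is exactly why the two claims aren’t interchangeable.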

Now, back to the story… You, the data fluent PM, are curious about these numbers. Specifically, you want to know what’s driving them. You notice that these metrics are averages across all features of Product X.

 

If you look under the hood, there might be an explanation of which feature(s) are contributing to the bump. You cut the data by feature (one way to do this is sketched below) and arrive at the following chart.
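As a sketch of what that cut might look like in code (the event table, its columns, and its numbers are hypothetical stand-ins, not Jimmy’s real data):

```python
# Cutting the blended average into per-feature lifts.
# The table and its numbers are hypothetical.
import pandas as pd

events = pd.DataFrame({
    "feature": ["search", "search", "home", "home", "search", "home"],
    "cohort":  ["x_only", "x_and_y", "x_only", "x_and_y", "x_and_y", "x_only"],
    "clicks":  [100, 146, 200, 210, 150, 195],
})

per_feature = (
    events.groupby(["feature", "cohort"])["clicks"]
          .mean()
          .unstack("cohort")
)
per_feature["lift"] = per_feature["x_and_y"] / per_feature["x_only"] - 1
print(per_feature)  # one lift per feature instead of one blended average
```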

 

 

It looks like View Home Page has a nice little bump of 5%. That doesn’t even explain half of the 11.3%, though.

 

Ah, but look at the Search feature. There’s a whopping 46% increase in Searches by users of X+Y. This is a pretty significant finding.

 

You’re about to share this with Jimmy and Kim, but then you realize something: that same Search feature is also used in Product Y. Searches initiated from Product Y might be counted as searches in Product X. There’s a chance that the counts are wrong.
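If the logs record where each event originated, this is a quick check. Here’s a sketch; the source_product field is entirely hypothetical, and whether such a field exists (and can be trusted) is the real question.

```python
# Checking whether searches credited to Product X actually fired there.
# The source_product field and all numbers are hypothetical.
import pandas as pd

searches = pd.DataFrame({
    "cohort":         ["x_and_y", "x_and_y", "x_and_y", "x_and_y",
                       "x_only", "x_only"],
    "source_product": ["X", "Y", "Y", "X", "X", "X"],
    "count":          [50, 40, 35, 21, 60, 40],
})

# Blended view: X+Y users look 46% more active (146 vs. 100 searches)
print(searches.groupby("cohort")["count"].sum())

# Only searches that truly fired inside Product X: the lift vanishes
in_x = searches[searches["source_product"] == "X"]
print(in_x.groupby("cohort")["count"].sum())  # 71 vs. 100: it even reverses
```

Under these invented numbers, the entire “lift” was Product Y traffic leaking into Product X’s counts.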

 

The whole story changes. Product Y may not impact Product X usage as Jimmy thought. There may not even be a relationship.

 

Good luck untangling that with the rest of the org.

I wish this example with Jimmy weren’t true. Jimmy was a Sr. PM Manager in real life. Kim really did tell thousands of employees. So where did it fall apart?

  • Not exploring the “why” or the drivers behind the bump. The original question asked about causality. Jimmy was nowhere near answering that question. There was even a chance that Product Y didn’t have an impact on Product X usage at all.
  • Not telling the story. See the last point.
  • Not being aware of the data’s blind spots. A critical insight was only one cut of data away from Jimmy’s first analysis. Knowing that the same Search function was used across both products surfaced a blind spot.

There are a few critiques you could make about the analysis, but you get the picture. I’d like to think that Jimmy is a self-aware guy and has long since corrected his literacy issues.

 

We’ve covered many different principles and considerations for evolving your data literacy. Evolve is not a word I use lightly: there are many pressures that can push your fluency to grow, if you embrace them.

 

All the examples above are real. They are not meant to disparage or discourage. Rather, they’re meant to make you aware of the pitfalls.

 

Being data fluent carries a significant responsibility: to be accountable to reality and intellectually honest, so the organization can make the best possible decisions.