The Cowboy Problem

“I cannot live without brain-work. What else is there to live for?” – Sherlock Holmes

I saw this challenge (minus the solution) being passed around in a WhatsApp message. After I solved it, it struck me as a fantastic way to illustrate the difference between data, information, knowledge, and insights, under the pretext of not getting shot.

Let’s get started, shall we?

The Problem

Figure. 4 Cowboys buried in the ground, with a wall between them, facing a sadistic firing squad.

  • Four men are buried up to their necks in the ground, as illustrated.
  • They cannot move, so they can only look forward.
  • Between A and B is a brick wall that prevents either side from seeing the other.

They all know that between them, they are wearing 4 hats, 2 black and 2 white. None of them knows the color of his own hat; each knows only the hat colors of the person (or people) in front of him, within his field of view.

In order to avoid being shot, one of them must call out [to the executioner] the color of his own hat. If he gets it wrong, the executioner instantly shoots all of them. They are not allowed to talk to one another, and have 10 minutes to ponder the problem.

We are told, after 1 minute, one of them calls out.

The Question

“It is a capital mistake to theorize before one has data. Insensibly, one begins to twist facts to suit theories, instead of theories to suit facts.” – Sherlock Holmes

  • Which one calls out the color of their hat?
  • Why is he 100% certain of the color of his hat?

I will give you 1 minute to solve the problem, before you scroll down and see the solution revealed in detail. Why 1 minute? Well, clearly, one of the cowboys above was able to solve the problem in 1 minute, and he saved them all from getting shot. That said, it’s alright if you don’t get it in the first minute; you still have about 9 minutes left before you and your companions are all shot.
.
.
.

This is time passing… tick tock.
.
.
.
1 minute later…

The Solution

“Hurry, Watson! The game is afoot.” – Sherlock Holmes

I will now walk you through each of the steps in solving this problem, being somewhat elaborate in my explanation, and while doing so, I hope it will resonate with my earlier diagram illustrating the path from Data to Insight to Action.

Data

“Data! Data! Data! I can’t make bricks without clay.” – Sherlock Holmes

First, let’s consider the data we are given.

  • We have 4 cowboys (yeehaw!), each wearing 1 hat, either white or black.
  • They are buried in 4 spots, unable to move or turn their heads, so they can only look straight ahead.
  • The wall prevents A from seeing B, C, and D, and vice versa.

That constitutes the data and the physical constraints of the problem.

Information

“You see, but you do not observe. The distinction is clear.” – Sherlock Holmes

Information is a higher-order aspect of the data, distinctly different from noise, so to speak. There is a whole field of information theory, developed by Claude Shannon, that deals with the coding of messages. This has tremendous importance to the field of machine learning, as the field is rife with references to concepts from this area.

So, what information do we have here, that is inferred from the problem statement?

  • A and B can only see the wall. They cannot turn their heads to look behind them.
  • Since they are immobile and buried in a straight line, C sees only B, and D sees both C and B; no one sees A.
  • No one knows their own hat color.
  • Each cowboy knows the others’ positions.
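
To make the "who sees whom" information concrete, here is a minimal sketch in Python. The names `VISIBLE` and `observed` are my own, and the hat assignment is a hypothetical one chosen purely for illustration; only the lines of sight come from the bullets above.

```python
# Positions in line order; the wall sits between A and B.
ORDER = ["A", "B", "C", "D"]

# Lines of sight taken from the bullets above: C sees B; D sees C and B.
VISIBLE = {
    "A": [],          # faces the wall
    "B": [],          # faces the wall
    "C": ["B"],
    "D": ["B", "C"],
}

def observed(viewer, hats):
    """Return the hat colors `viewer` can directly see, keyed by wearer."""
    return {name: hats[name] for name in VISIBLE[viewer]}

# Hypothetical assignment (an assumption, not given in the problem statement).
hats = {"A": "black", "B": "white", "C": "black", "D": "white"}

for viewer in ORDER:
    print(viewer, "sees", observed(viewer, hats))
```

Note that nobody's own hat ever appears in what they observe, and A never appears in anyone's view.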

Figure. What everyone “sees”


Knowledge

“Crime is common. Logic is rare. Therefore, it is upon the logic rather than upon the crime that you should dwell.” – Sherlock Holmes

Next, we consider what everyone “knows”. Knowledge has two aspects here: (a) the system-level knowledge, and (b) the local aspects of the problem. Knowledge is a yet higher-order refinement of Information. Information is piece-wise. Knowledge is systemic. In formal terms, it is the set of facts and rules of the system taken as a whole. The laws of the game, so to speak.

So, what do we “know”?

  • The 4 companions get 1 shot to get this right; if they miss, they all get shot.
  • None of them will call out early unless they are 100% certain of their hat color, because they all value their lives. Granted, they might attempt a 9th-minute “Hail Mary” guess (an American football reference), but luckily, one of their companions yelled out the right answer in under a minute. So, we know he wasn’t “guessing”.
  • We also know that A and B don’t matter, because they are equally ignorant of one another, as well as everyone else, due to their positions.
  • Looking at the options, comprehensively, we see that only a select few of them fit with our observations.
    • 4 slots [1,2,3,4]
    • 2 options each {B, W}
    • only 2 of each (2Bs, 2Ws)

So, arranging them by position [A B C D], we know that one of these must be true…

  • [B W B W]
  • [W W B B]
  • [B W W B]
  • [W B W B]
  • [W B B W]
  • [B B W W]

We can, of course, deduce that the last three are not possible, as both C and D know B’s hat is white (W).
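
The counting above can be checked with a few lines of Python. This is only a sketch: I am assuming the tuple order (A, B, C, D) and using "B"/"W" for black/white as in the text; the variable names are mine.

```python
from itertools import permutations

# Every distinct way to place 2 black (B) and 2 white (W) hats on A, B, C, D.
all_worlds = sorted(set(permutations("BBWW")))

# C and D both see that the cowboy in position B (index 1) wears white.
viable = [w for w in all_worlds if w[1] == "W"]

print(len(all_worlds))               # 6
print(["".join(w) for w in viable])  # ['BWBW', 'BWWB', 'WWBB']
```

Six arrangements exist in total, and seeing B's white hat cuts them down to three.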

  • We might also reason that, of all 4 cowboys, D has the most directly observable data, since he can see everyone else (except A), so he’d be a prime candidate to be the first to call out. C sees only B’s hat, so he has to choose among 3 options. D can eliminate one additional option since he can also see C’s hat color, leaving him to ponder two possibilities.
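
As a rough illustration of how much each cowboy can narrow things down from direct observation alone, the sketch below counts the arrangements each one cannot rule out. The "actual" arrangement here is an assumption made for the example, and `SEES` and `candidates` are names I made up.

```python
from itertools import permutations

SLOTS = "ABCD"
SEES = {"A": "", "B": "", "C": "B", "D": "BC"}   # field of view from the text
WORLDS = sorted(set(permutations("BBWW")))        # 2 black, 2 white hats

def candidates(viewer, actual):
    """Arrangements the viewer cannot rule out from what he directly sees."""
    seen = [SLOTS.index(s) for s in SEES[viewer]]
    return [w for w in WORLDS if all(w[i] == actual[i] for i in seen)]

# Assumed true arrangement for (A, B, C, D); an illustration only.
actual = tuple("BWBW")

counts = {v: len(candidates(v, actual)) for v in SLOTS}
print(counts)  # {'A': 6, 'B': 6, 'C': 3, 'D': 2}
```

A and B learn nothing from the wall, C is left with 3 options, and D with 2, which matches the reasoning above.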

Figure. What A, B, C, and D can reasonably “know” given what they can observe (data).


So, are you guessing D? How do you know for sure? It is certainly a good guess, but we’re not guessing. Remember, we need to be certain. If you did guess, you probably got everyone shot. Good job. Hashtag: #WTF.

Insight

“The world is full of obvious things which nobody by any chance ever observes.” – Sherlock Holmes

So, we can’t guess and we can’t call out, because we don’t have certainty. How do we get certain? Is there anything in the problem that offers a clue to this conundrum? Oh wait, did I say ‘certainty’? Hmmm. Let’s hold that thought.

Well…

  • We know that D has the most data, but surprisingly, he doesn’t call out for a good minute, despite being able to see both B and C.
  • C notices that D doesn’t call out, despite D having more direct observable data.
  • C quickly reasons that D doesn’t know, despite his vantage point. (A key point that C will now exploit!)
  • So, we can now intuit that C has more (indirect) information than D as we pass the 1-minute mark, given D’s response (or lack thereof).
  • C can exploit the fact that, in both of the 2 remaining scenarios D is contemplating, C’s hat is the same color (Black).

Silence speaks volumes, they say.

“…when you have eliminated the impossible, whatever remains, however improbable, must be the truth…” – Sherlock Holmes

Strategy

“Having gathered these facts, Watson, I smoked several pipes over them, trying to separate those which were crucial from others which were merely incidental.” – Sherlock Holmes

C examined all the data, assessed all the possibilities, and reasoned out that there were only 3 viable options, but he would need an additional bit of information, which he could only glean by observing the responses of the others to this situation. So, he hatched a scheme to determine exactly what each knew to be true, and what this implied.

Figure. C decides to look for D’s response to help him determine what D’s level of certainty might be. If D is 100% certain, there is only 1 way he could be that certain. On the other hand, if he was uncertain, there are still two possibilities, so it would be a 50:50 guess. C knows D won’t guess.


C reasoned that he would sort through these options by using D’s response (or lack thereof) to determine D’s level of certainty. If D had enough information, he would have responded with certainty. That could only happen if D saw two hats of the same color: had C’s hat been white like B’s, D would have known at once that his own hat was black. D’s silence therefore told C that this 1 of the 3 options was impossible, leaving 2 options for which D had only a 50:50 chance of choosing correctly. This uncertainty meant that D wouldn’t risk guessing. However, C is wearing a black hat in both of the cases where D is uncertain. So, D not knowing his own hat color doesn’t matter, because his uncertainty reveals to C that C’s hat color could not have been the same as B’s hat color (W). Therefore, C reasons that he must have on a Black hat.

C’s critical insight was to understand that the 2 uncertain options shared 1 important feature: the color of C’s hat was black in both cases where D was uncertain of his own hat color, [B W B W] or [W W B B] (listed in the order A, B, C, D).
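
C's inference from D's silence can be sketched in code as well. This is a minimal model under my own assumptions: arrangements are tuples ordered (A, B, C, D), nobody guesses, and the function names `d_knows_own_hat` and `c_deduction` are hypothetical.

```python
from itertools import permutations

WORLDS = sorted(set(permutations("BBWW")))  # hats for (A, B, C, D)

def d_knows_own_hat(world):
    """D sees B and C; he is certain only if every arrangement that
    matches what he sees agrees on the color of his own hat."""
    matches = [w for w in WORLDS if w[1] == world[1] and w[2] == world[2]]
    return len({w[3] for w in matches}) == 1

def c_deduction(b_hat, d_called_out):
    """Colors still possible for C's hat, given B's hat (which C sees)
    and whether D called out (which C hears)."""
    consistent = [
        w for w in WORLDS
        if w[1] == b_hat and d_knows_own_hat(w) == d_called_out
    ]
    return {w[2] for w in consistent}

print(c_deduction("W", d_called_out=False))  # {'B'}
print(c_deduction("W", d_called_out=True))   # {'W'}
```

With B in white and D silent, only black remains for C. Had D called out instead, C would have learned that his own hat was white, so D's behavior is informative either way.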

Action

“My dear Watson, you were born to be a man of action. Your instinct is always to do something energetic.” – Sherlock Holmes

Here, I am lumping C’s tactics into “Action” since we’re dealing with a fairly straightforward scenario, and not a complex process. Ultimately, no matter how good your strategy is, execution is often half the battle, because in a world where everyone has approximately the same information, the diligent early bird that executes more skillfully gets the proverbial “worm”. Here, meaningful execution requires that the cowboys solve the problem in under 10 minutes, before everyone gets shot.

After giving D a minute to consider his options, C is clued into D’s uncertainty, which reveals that only 2 of the 3 options remain viable. This allows C to call out his own hat color (Black) with 100% certainty, saving his companions from a grisly end.

So, we can see from this exercise that: (1) Data, Information, Knowledge, and Insight do not mean the same thing; (2) Insights challenge the prevailing model, and often provide hidden context that is essential to unraveling the underlying causal scheme of events; and (3) without the insight, the problem remains unsolvable. The solution was entirely dependent on the “meta” layer of information and knowledge of the system, which was not explicitly stated in the data, but which was suggested by the data.

So, merely having a large amount of data does not solve all your problems; you need to think carefully about what the data implies and means. This is where the data scientist must excel as a critical thinker, drawing on expertise and intuition in statistical thinking, knowledge of the system and domain, and so on, to arrive at critical insight about the underlying system or process, and then translating that insight into a repeatable, scalable process that the business can exploit to drive revenues, reduce costs, or avoid disasters.

As you reflect on the complexity of translating your data into insights, ask yourself, does your practice think this comprehensively about your business? What is your knowledge strategy? How will you translate data to insights and act upon them in a meaningful way? At Mastech InfoTrellis (MIT), we can help your practice achieve unicorn status, enabling you to operationalize your data and drive value to your bottom line. I have made my career in identifying similar insights yielding billions of dollars in bottom-line impact for enterprise clients, and we’ve built a data science practice group around my methods in order to systematize this at scale for our clients. If all your conversations are about infrastructure or models, you are definitely missing a trick. Come talk to us.

Knowledge Strategy
As we all continue to lock down and spend our time at home contemplating the seconds, minutes, and hours of our existence, I am reminded of this quote by my favorite consulting detective:

"My mind is like a racing engine, tearing itself to pieces because it is not connected up with the work for which it was built. My mind rebels at stagnation. Give me problems, give me work, give me the most abstruse cryptogram or the most intricate analysis, and I am in my own proper atmosphere. I can dispense then with artificial stimulants. But I abhor the dull routine of existence. I crave for mental exaltation. That is why I have chosen my own particular profession, or rather created it, for I am the only one in the world." – Sherlock Holmes

That’s all folks! Y’all have a happy virus-free week. And, remember, don’t get shot!

Prad Upadrashta

Senior Vice President & Chief Data Science Officer (AI)

Prad Upadrashta, as Senior Vice President and Chief Data Science Officer (AI), spearheaded thought leadership and innovation, rebranded and elevated our AI offerings. Prad's role involved crafting a forward-looking, data-driven enterprise AI roadmap underpinned by advanced data intelligence solutions.