DIKW: Data, Information, Knowledge, Wisdom

Data-Tunnel.jpg

Here's the thing…data is useless. Now, given what we do (or are at least perceived by the world at large to do), I should probably qualify that, huh? Honestly, though, I think the statement can stand on its own. While data seems like it's useful, it's trash, and this fact causes me no end of angst. We're constantly referred to as "data providers", even by members of our own team and in our own marketing collateral, but in actuality we do not provide data.

We don't provide data because it's useless, meaningless, without value. Data is a collection of unrelated facts, the "product of observation", with no meaning beyond its own existence. At their most basic levels, our products and services provide information, and go up from there. We're information providers, and—much more importantly—knowledge providers. If you're a data-geek, those distinctions and the DIKW "knowledge hierarchy" concepts probably aren't new and the next bits are going to seem a bit "Applied Information Science 101" to you—go on and bail, my feelings won't be hurt. If you're not a data-geek, but interested enough in information science for business or other reasons to be reading this blog, it's probably a good distinction to start making. Note: if you are a data-geek and you haven't read Ackoff's "From Data to Wisdom"—go read that instead of this.

Data:

  • I have three Things.
  • One of the Things is reddish-brown, two Things are grey.
  • One Thing weighs about two tons, one Thing about 10lbs., and one about 20g.
  • One Thing has a trunk, one Thing has a tail, and one Thing has a publicist. Yep, a publicist.

No fair drawing correlations and/or conclusions yet! "Data" is exactly what you see there in that pile. We now have some facts (and even that's an assumption at this point) about three "Things". That's it, no more, no less. That's data. See what I mean? Useless. So if data is useless, what the hell are we doing? Well, if you take data and apply some processes to clean it, standardize it, and create some relationships between its constituent bits and pieces, you get information.

Information:

ID  COLOR          WEIGHT  OTHER
1   reddish-brown  10lbs.  has a publicist
2   grey           20g     has a tail
3   grey           2 tons  has a trunk
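For the data-geeks who stuck around, here is a minimal sketch of the "clean it, standardize it, relate it" step that turns the pile of facts into the table above. Everything in it is an assumption made for illustration: the `RAW_FACTS` tuples, the `weight_in_grams` and `build_records` helpers, the choice of grams as the common unit, and the reading of "tons" as US short tons. It also assumes each fact already arrives tagged with the ID of its Thing, which in real life is usually the hard part.

```python
import re
from collections import defaultdict

# Hypothetical raw observations as (thing_id, field, raw_value) tuples.
# Assumption: each fact already carries an ID linking it to a Thing.
RAW_FACTS = [
    (3, "weight", "2 tons"),
    (1, "color", " Reddish-Brown"),
    (2, "weight", "20g"),
    (3, "other", "has a trunk"),
    (1, "weight", "10lbs."),
    (2, "color", "GREY"),
    (1, "other", "has a publicist"),
    (3, "color", "grey"),
    (2, "other", "has a tail"),
]

# Conversion factors to grams; "tons" is assumed to mean US short tons.
GRAMS_PER_UNIT = {"g": 1.0, "lbs": 453.592, "tons": 907_185.0}


def weight_in_grams(raw):
    """Standardize a free-text weight such as '10lbs.' to whole grams."""
    match = re.fullmatch(r"([\d.]+)\s*([a-z]+)\.?", raw.strip().lower())
    if match is None or match.group(2) not in GRAMS_PER_UNIT:
        raise ValueError(f"unrecognized weight: {raw!r}")
    return round(float(match.group(1)) * GRAMS_PER_UNIT[match.group(2)])


def build_records(facts):
    """Clean each fact and relate it to the others that share its ID."""
    grouped = defaultdict(dict)
    for thing_id, field, raw_value in facts:
        if field == "weight":
            grouped[thing_id]["weight_g"] = weight_in_grams(raw_value)
        else:
            grouped[thing_id][field] = raw_value.strip().lower()
    return [{"id": i, **grouped[i]} for i in sorted(grouped)]


records = build_records(RAW_FACTS)
for record in records:
    print(record)
```

Note that every choice in there (the unit, the casing, what counts as a parseable weight) is a human decision baked into the process, which is exactly why getting this layer wrong poisons everything built on top of it.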

I'd argue that this stuff—information—is only mildly less useless than data, but it's a start. It's organized and has at least the potential(!) for allowing us to manufacture knowledge from it. It's important only because if you get this part wrong, then any derivative knowledge is also suspect. Truthfully though, unless you know what you're doing this stuff is almost more dangerous than raw data (more on that in a minute). The only reason we provide it in this form is because some of our customers have the desire and acumen to manufacture their own knowledge, and just want to make certain that they have the very best raw materials for doing so, and advice on the best way to go about the process.

But that raises the question: what is knowledge? Basically, you take your set of information and apply a cognitive process to it, one which actually draws correlations and conclusions, hypothesizes causal relationships, etc. This is done using a variety of mechanisms, which all boil down to human analysis. Algorithms, models, simulations: at the end of the day each is just what some human being, or a group thereof, decided would be a useful way to process information into knowledge and separate signal from noise.

Knowledge:

ID  COLOR          WEIGHT  OTHER            RECORD_TYPE
1   reddish-brown  10lbs.  has a publicist  tabby cat
2   grey           20g     has a tail       mouse
3   grey           2 tons  has a trunk      elephant
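One way to picture the step from the information table to this one is a handful of hand-written rules. The sketch below is only that, a sketch: the `classify` function, its thresholds, and the `weight_g` field (weights normalized to grams) are all hypothetical, invented here to make the point that the "cognitive process" is ultimately something a person decided on.

```python
# Information records, with weights assumed to be normalized to grams.
RECORDS = [
    {"id": 1, "color": "reddish-brown", "weight_g": 4536, "other": "has a publicist"},
    {"id": 2, "color": "grey", "weight_g": 20, "other": "has a tail"},
    {"id": 3, "color": "grey", "weight_g": 1814370, "other": "has a trunk"},
]


def classify(record):
    """Hypothetical hand-written rules; the first rule that matches wins."""
    weight, other = record["weight_g"], record["other"]
    if "trunk" in other and weight > 1_000_000:
        return "elephant"
    if "tail" in other and weight < 100:
        return "mouse"
    if record["color"] == "reddish-brown" and 2_000 <= weight <= 10_000:
        return "tabby cat"
    return "unknown"


# Knowledge = information plus a conclusion drawn from it.
knowledge = [{**record, "record_type": classify(record)} for record in RECORDS]
for row in knowledge:
    print(row["id"], row["record_type"])
```

The rules reproduce the RECORD_TYPE column above, but notice how much opinion is hiding in those thresholds and in the order of the checks.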

Well, that's much more useful! It tells us what each of the entity-instances (records) is, and some of their attributes (fields). Feeling warm and fuzzy, now? Here's the punchline: this last table, the one describing the knowledge we rendered from the information, which was in turn cobbled together from the data…as described herein, it has the potential to be both incorrect and incomplete. Remember the old adage, "it's not what you don't know that messes you up, it's what you know that isn't so"? I'm paraphrasing, of course.

So how could our example be wrong? In the knowledge set we drew the not-unnatural conclusion that #3 was an elephant. What if it's a Chrysler 300? That fits the available information (grey/2 tons/trunk). It could be something else altogether, though. How might our example be incomplete? In #1 we correctly assessed the "Thing" to be a tabby cat, but failed to differentiate it as Morris the Cat (ergo, the publicist), a fairly important piece of knowledge, and a conclusion that might realistically have been drawn by a sophisticated enough model. Now take it up a step, to the information. What if the aggregation process failed and the #2 record has the trunk, #3 the tail? Well, the probability that #3 is, in fact, an elephant just increased. But maybe #2 is actually Stuart Little, or Fievel. I mean, how many other mice do you know with trunks?
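To make the ambiguity concrete, here is a rough sketch that returns every label consistent with a record instead of committing to one. The `PROFILES` table, its weight ranges and colors, and the `candidates` helper are all made up for this illustration, and weights are assumed to be in grams.

```python
# Hypothetical profiles: what each kind of Thing is allowed to look like.
PROFILES = {
    "elephant": {"colors": {"grey"}, "grams": (1_500_000, 6_000_000),
                 "features": {"trunk", "tail"}},
    "Chrysler 300": {"colors": {"grey", "black", "white"},
                     "grams": (1_600_000, 2_100_000), "features": {"trunk"}},
    "mouse": {"colors": {"grey", "brown", "white"}, "grams": (10, 40),
              "features": {"tail"}},
    "tabby cat": {"colors": {"reddish-brown", "grey"}, "grams": (3_000, 7_000),
                  "features": {"tail", "publicist"}},
}


def candidates(record):
    """Return every label whose profile is consistent with the record."""
    feature = record["other"].split()[-1]  # "has a trunk" -> "trunk"
    matches = []
    for label, profile in PROFILES.items():
        low, high = profile["grams"]
        if (record["color"] in profile["colors"]
                and low <= record["weight_g"] <= high
                and feature in profile["features"]):
            matches.append(label)
    return sorted(matches)


thing_3 = {"color": "grey", "weight_g": 1814370, "other": "has a trunk"}
# The aggregation failure described above: trunk and tail swapped.
swapped_3 = {**thing_3, "other": "has a tail"}
swapped_2 = {"color": "grey", "weight_g": 20, "other": "has a trunk"}

print(candidates(thing_3))
print(candidates(swapped_3))
print(candidates(swapped_2))
```

Under these made-up profiles the original #3 comes back as both a Chrysler 300 and an elephant, the swapped #3 narrows to the elephant alone, and the swapped #2 matches nothing at all. The knowledge is only ever as good as the information and the profiles you fed it.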

Which brings us to wisdom. Wisdom is basically a local phenomenon, which is strangely topical given that recent conversations in the RE.net seem to revolve heavily around localism as the most significant agent/broker value proposition. I've heard it phrased as "local knowledge". Not to belabor the semantics, but I feel the phrase "local wisdom" is more applicable.

I mean, we have knowledge. From evaluating the information Onboard organizes from the data that we aggregate, I "know" that the schools are "great" in an area, and maybe I can therefore help home-buying parents find a starter home. The local agent, though, can tell them that the HOA for the home they're looking at just voted in a real PITA who hates kids and doesn't let them ride their bikes without sign-off in triplicate, and that speeding seems to be a problem. You probably won't find that in our databases. Yet :-).

For the purists, I know I skipped Ackoff's "Understanding" layer, formally defined as the "appreciation of why" (as opposed to "who, what, where, when, and how") and nested between knowledge and wisdom. This is by design. First, the common wisdom (loaded word in this context) in information science circles seems to be to steer away from some of the more…metaphysical aspects connoted by his treatment of the subject. Second, if you look at my treatment of the "Knowledge" layer you'll see that I tend to combine in the one layer both the deterministic processes defined by Ackoff's version and the probabilistic/interpolative processes he espouses for his "Understanding" layer. I don't really see the benefit in a separation, and actually feel that the cognitive processes involved are complementary enough to warrant combination as a matter of course. And if he doesn't like it, he can just come find me, huh? Battle Royale!

What's the point of all this? The point is that data is useless, information is only as good as the systems and assumptions used to process it, and the quality of a knowledge set is a function of both its constituent information and the cognitive processes used to manufacture that knowledge. Ultimately, the way you determine whether a "data provider" is worth a damn is by looking at the people who make up the team which aggregates and organizes that data into information, and whose grey matter and diligence are responsible for transforming that refined information into useful knowledge.

Final note: this shouldn't be construed as the only legitimate treatment of knowledge management, or as a comprehensive description of our thought processes at Onboard. In many ways this methodology is limited, and doesn't (at least not intuitively) take into account the dynamism inherent in knowledge of any significantly useful complexity. My intention was to use this as an introduction to the amount and nature of thought that goes into creating knowledge, and to identify the sharp difference between a product and "data". Data may be fungible, but knowledge…is…not. And, knowledge-wise? I'll put our team up against anyone else's.

As for wisdom? It's probably overrated, and almost certain to remain a uniquely ineffable human endeavor. We're working on it though. This'll have to do for now:

Information is not knowledge
Knowledge is not wisdom
Wisdom is not truth
Truth is not beauty
Beauty is not love
Love is not music
Music is THE BEST...
Wisdom is the domain of the Wis

- lyrics from Frank Zappa's rock opera, "Joe's Garage", Act III, Scene XVI

Image Credit: Luckey_Sun on Flickr.com