Metadata, So Mom Can Understand
Republished, by popular demand
Dear Mom, I’m glad you enjoyed my last letter explaining what data is and how people in my industry make a living managing it. After that letter, you confidently answered all data-related questions your knitting-circle friends could throw at you. But then Edward Snowden, former NSA contractor and world-renowned whistle-blower, came on the scene. Suddenly mainstream news anchors are talking about metadata.
I got your panicked voicemail and, as promised, I’m going to try to clarify what metadata is and how it relates to data.
As you recall, data is simply a thing you might want to remember that’s saved somewhere for future use. Well, I can’t promise metadata is as easy to understand as data. It’s an abstract concept, and you’re not alone in being confused by it. Most business leaders who use metadata on a daily basis have no clue what it is – or that they’re even using it!
At its most basic level, metadata is information that describes the data you’re trying to remember. As an example, consider when you and Dad download your latest “Game of Thrones” episode from Amazon.com. The actual video of the episode that you watch is the “data” since it’s what you want to access and enjoy. But Amazon provides a wide variety of supporting information to help you search for the show and confirm it is definitely the one you want to purchase. This additional information is the metadata used to describe the episode, such as:
- Season: 1
- Episode Number: 7
- Episode name: “You Win or You Die”
- Original Air Date: May 22, 2011
- Runtime: 58 minutes
- Director: Daniel Minahan
- Episode description: “Ned confronts Cersei about her secrets; Jon takes his Night’s Watch vows; Drogo promises to lead the Dothraki to King’s Landing.”
So what’s the value of metadata? Well, you don’t need the metadata to enjoy watching your show, but this additional information helps you feel confident you’re purchasing the actual show you want to watch.
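For the programmers peeking over your shoulder, Mom, here is one way the episode example could be sketched in Python. The `Episode` class, the field names, and the `matches` helper are all my own inventions for illustration; this is not how Amazon actually stores anything.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    video: bytes  # the "data": the thing you actually want to watch
    metadata: dict = field(default_factory=dict)  # information describing the video


episode = Episode(
    video=b"",  # empty placeholder standing in for the real video file
    metadata={
        "season": 1,
        "episode_number": 7,
        "episode_name": "You Win or You Die",
        "original_air_date": "2011-05-22",
        "runtime_minutes": 58,
        "director": "Daniel Minahan",
    },
)


def matches(ep: Episode, **wanted) -> bool:
    """Return True if every requested metadata field has the requested value."""
    return all(ep.metadata.get(key) == value for key, value in wanted.items())


# Confirming this is the right show uses only the metadata, never the video.
print(matches(episode, season=1, episode_number=7))
```

Notice that the check never opens the video itself. That is the whole point: the metadata lets you find and confirm the data without touching it.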
But that Amazon example is a consumer concept where the “data” of interest happens to be a television show – which can be confusing to translate into a business context. In my letter explaining data, I told you that my address was an example of data that is important to most companies. Examples of metadata that describes my address data include:
- Address type: Work (versus Home, for example)
- Residential/Commercial: Commercial
- Latitude/Longitude: 37.504297/-122.219733 (Often used by GPS and other mapping tools like Google Maps to link postal addresses to a physical location on a map)
Believe it or not, there is often a lot more metadata associated with this simple address data, but much of it gets very technical. Hopefully this helps illustrate how metadata provides additional descriptive information about my address that’s not explicitly in the address itself.
Let’s wrap this up by explaining the NSA/Verizon/metadata hoopla.
From what I’ve picked up from my primary sources of news and current events (“The Daily Show” and “The Colbert Report” of course), The Guardian newspaper broke the story about how the NSA is collecting phone records from Verizon customers. That story stated:
“Under the terms of the blanket order, the numbers of both parties on a call are handed over [to the NSA] as is location data, call duration, unique identifiers, and the time and duration of all calls. The contents of the conversation itself are not covered.”… “The information is classed as ‘metadata’, or transactional information, rather than communications, and so does not require individual warrants to access.”
The big news wasn’t that the NSA is collecting the audio or transcript of the phone conversations themselves – I don’t believe Snowden accused them of doing that. Instead, they are collecting the information that describes the phone conversation: the location, call duration, time of call, etc. – a.k.a. the metadata. Many have expressed fear and outrage over this incident due to the privacy implications and fears of our government overstepping important boundaries. Both sides of this conversation are very interesting, but that’s not the point of this letter.
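To make the split between a call and its metadata concrete, here is a rough Python sketch. Everything in it is made up for illustration: the `PhoneCall` class, the field names, the phone numbers, and the `metadata_only` helper. It is only meant to show the idea of handing over the description of a call without the conversation, not how any phone company or agency actually stores records.

```python
from dataclasses import dataclass, asdict


@dataclass
class PhoneCall:
    caller: str            # metadata: number of one party
    recipient: str         # metadata: number of the other party
    start_time: str        # metadata: when the call happened
    duration_seconds: int  # metadata: how long it lasted
    location: str          # metadata: where the call was placed
    audio: bytes           # the conversation itself (the "data")


def metadata_only(call: PhoneCall) -> dict:
    """Return everything that describes the call, leaving out its contents."""
    record = asdict(call)
    record.pop("audio")
    return record


# Invented example values; none of these refer to a real call.
call = PhoneCall(
    caller="555-0100",
    recipient="555-0199",
    start_time="2013-06-05T19:30:00",
    duration_seconds=240,
    location="Redwood City, CA",
    audio=b"",  # empty placeholder for the recorded conversation
)

print(metadata_only(call))
```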
I’ll end this letter with a fun fact (okay, “fun” is extremely subjective). Just as a garage sale aficionado will preach “one person’s trash is another person’s treasure” – one person’s metadata is often another person’s data. For example, while your target data on Amazon.com was the episode of “Game of Thrones”, a huge Daniel Minahan fan might be using Amazon’s large inventory of television shows and movies to create a list of all videos directed by him. In this case, Daniel Minahan is the data that’s important to them, not just metadata explaining who directed your episode.
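Here is a small Python sketch of that flip, where the director stops being a descriptive field and becomes the thing being looked up. The `catalog` list and the `titles_by_director` function are hypothetical, and apart from the real episode, the titles and the second director name are invented placeholders.

```python
from collections import defaultdict

# A tiny made-up catalog; only the first entry is a real episode.
catalog = [
    {"title": "You Win or You Die", "director": "Daniel Minahan"},
    {"title": "Hypothetical Episode A", "director": "Daniel Minahan"},
    {"title": "Hypothetical Episode B", "director": "Another Director"},
]


def titles_by_director(items):
    """Group titles under each director, so the director becomes the lookup key."""
    index = defaultdict(list)
    for item in items:
        index[item["director"]].append(item["title"])
    return dict(index)


# For the Daniel Minahan fan, the director is the data and the titles describe it.
print(titles_by_director(catalog)["Daniel Minahan"])
```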
I hope this helps you in your knitting circle conversations. The kids are looking forward to their matching scarves!
Love, Rob
PS Seriously, why did Dad buy a 3-D printer?? I have absolutely no idea how to fix one!
PPS Dad says you’ve been reading a lot of Thomas Friedman – I’ll take a crack at Big Data in my next letter.