Operationalizing Data Lake Privacy Governance for Value Creation

Last Published: May 31, 2022
Nathan Turajski

Senior Director, Product Marketing, Data Governance & Privacy

How today’s businesses are gaining new customer insights through analytics while meeting the growing challenge of modern privacy mandates

A global footwear manufacturer was recently in the news over a shoe design controversy that polarized consumers between patriotism and social justice activism. When jumping into the debate, a common conclusion is often, “Hey, they KNOW their audience, they’ll be fine.” But how do consumer-focused companies REALLY know the impact—or do they even care?

While people may hold a variety of opinions as to whether their brand loyalty should be impacted by their personal ethics, most businesses nowadays do tend to know their audience’s preferences, and yes, any potential impact on customer loyalty is factored into marketing decisions. Certainly the benefit of the doubt goes to a premier consumer company with a brilliant track record of market growth as the result of its bold decisions.

But how does any top-notch marketing organization ensure that it not only preserves its core customers’ loyalty with each tough new marketing decision, but also builds better brand value and comes out ahead in the long term? One can suspect they must know their audience and know them well. There are political opinions and then there are consumer spending decisions. An enterprise whose responsibility is to shareholders and to optimizing business value quite often prioritizes the bottom line. Businesses understand who pays the bills, in addition to maintaining ethical standards.

It’s these sorts of examples that are becoming more frequent—where companies have figured out how to not only maintain consumer loyalty, but accelerate it, using the power of data lake analytics. For example, a simple search online for “big data value creation” illustrates the new trend of customer loyalty through innovation, which achieves real-time insights for businesses and allows them to unleash the power of data lakes to offer new products and services. One fun example is Morton’s The Steakhouse, where they not only tracked social media comments in real time but correlated these social media insights to a frequent loyal customer. They were then able to create a truly unique experience by delivering a steak to this customer when his plane flight landed! If big brother is watching, he comes with a porterhouse in hand.

But the Morton’s example does raise a key question: “Is your company even capable of something like this?” The answer is often complicated. Capable, yes. But willing? Not so much. Why is that?

With great power comes great responsibility

While consumers often assume businesses are using their personal data to further their marketing objectives, and we’ve seen high-profile examples of abuse, the reality is that most companies are very aware of the great privacy challenge ahead. Facebook, Cambridge Analytica, and the fall-out from consumers demanding more rights and business accountability have all factored into companies making more educated risk-based decisions for data exposure across the organization.

Should we take our personal data about consumers out of the hands of data scientists and hand it over to marketing teams, or will it be abused? Is it safe to run data analytics on a multi-tenant public cloud platform, or will access rights create a possible security breach? Will opening up our data for business expose us to liability we haven’t fully thought through yet? These are real risk vs. reward tradeoffs, and they determine whether data lake governance becomes a business enabler or a liability.

Data risk vs. reward considerations are increasingly executive-board-level decisions, primarily because of the need to balance privacy risks with value creation opportunities that can help outpace the competition and grow market share. Companies are now very aware of their choices to expose data lake assets, especially in light of modern privacy mandates and high-profile breaches, where data losses result in compliance violations, followed by the often irreparable, long-term loss of consumer trust.

Have your cake (or your steak) and eat it, too

Using data lakes to achieve consumer insights through analytics can also have plenty of upside. But this assumes that risk exposure can be governed as personal and sensitive data is used across a global and often complicated organization, with data accessed by many varied stakeholders across many geographic locations. With the European General Data Protection Regulation (GDPR) privacy mandate, the California Consumer Privacy Act (CCPA), and many other state and global mandates driving the threat of privacy penalties, businesses are understandably cautious about opening up too much all at once. However, the fear of legal repercussions is also driving technical innovations that handle data more responsibly through improved data lake privacy governance practices.

While the Morton’s example highlights the great precision data lake analytics can potentially offer for understanding consumer behavior, gaining new market insights doesn’t need to be as “personal” as it sounds. Why is that? Quite often, personal data can be effectively “de-identified” by removing sensitive personal info such as a name, address, or ID number, so that analysis simply looks at the macro trends that are more relevant to the business.
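As a rough illustration of the de-identification idea above, the following Python sketch drops direct identifiers from each record and then counts purchases by region. The field names (`name`, `address`, `id_number`, `region`, `product`) and the sample records are hypothetical, chosen only for this example, and removing direct identifiers is just one step in a real de-identification process.

```python
from collections import Counter

# Hypothetical field names; a real data lake schema will differ.
DIRECT_IDENTIFIERS = {"name", "address", "id_number"}


def de_identify(record):
    """Return a copy of the record with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def purchases_by_region(records):
    """Count purchases per (region, product) using only de-identified fields."""
    return Counter(
        (r["region"], r["product"]) for r in map(de_identify, records)
    )


# Made-up sample data for illustration only.
customers = [
    {"name": "A. Example", "address": "1 Main St", "id_number": "X1",
     "region": "southwest", "product": "sneaker"},
    {"name": "B. Example", "address": "2 Oak Ave", "id_number": "X2",
     "region": "southwest", "product": "sneaker"},
    {"name": "C. Example", "address": "3 Elm Rd", "id_number": "X3",
     "region": "northeast", "product": "boot"},
]

trend_counts = purchases_by_region(customers)
```

Here `trend_counts` holds only regional totals, so a marketing team could study the trend without ever seeing a name or address.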

Threading the needle: Personal data doesn’t need to be so “personal”

There is a case to be made that, for most data lake analytic insights at least, marketing teams and others focused on value creation do not need highly personal details. Consider example use cases such as determining how many users in a geographic area might be responsive to a promotion campaign. No names are needed to look at general buying behavior in a region—perhaps you discover that a high-income geography like Beverly Hills is more likely to buy a Mercedes or BMW. (Surprise!) But you get the idea. Amazon has mastered this through their affinity marketing program to build more loyalty, “Customers who viewed this item also viewed…” where no money is left on the table and no personal ID is required to create purchasing recommendations. Instead, it’s about applying behavioral insights and automating the results through machine learning and artificial intelligence to generate greater value.

Perhaps you want to know if thousands of your consumers are downloading the latest new media? Or, how many of your customers with kids in your southwest region visit your stores on any given weekend? Most organizations can simply look at de-identified consumer behavior data, without getting into personal identifiers, and safely gain new actionable insights. How? They are doing this by applying data-centric privacy controls such as dynamic data masking and persistent data masking (what privacy mandates can refer to as anonymization or pseudonymization techniques) that are targeted and persist with the data itself, no matter where that data is used, consumed, or analyzed. This is particularly critical in this age of portability, in public cloud environments where analytics workloads are increasingly migrating to achieve greater performance and scale.
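To make the two masking styles mentioned above more concrete, here is a minimal Python sketch. It assumes persistent masking is done with a keyed hash (HMAC-SHA256) that replaces the stored value, and that dynamic masking is applied at read time based on the caller's role. The key, the role name `privacy_officer`, and the `email` field are illustrative assumptions, not a description of any particular product.

```python
import hashlib
import hmac

# Assumption: in practice the key would come from a key management service.
SECRET_KEY = b"example-key-for-illustration-only"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Persistent masking: replace a value with a stable keyed-hash token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_email(email: str) -> str:
    """Hide all but the first character of the local part of an email."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def read_record(record: dict, role: str) -> dict:
    """Dynamic masking: stored data is unchanged; the view depends on role."""
    if role == "privacy_officer":  # hypothetical privileged role
        return dict(record)
    masked = dict(record)
    masked["email"] = mask_email(record["email"])
    return masked


stored = {"email": "jane@example.com", "region": "southwest"}
marketing_view = read_record(stored, "marketing")
token = pseudonymize(stored["email"])
```

Because the same input always yields the same token under a given key, records can still be joined and counted after persistent masking, while the dynamic mask leaves the stored value intact for roles that are permitted to see it.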

Data de-identification for privacy and business value—better together

New privacy mandates such as the GDPR, CCPA, and others are forcing businesses to rethink their data lake governance approach. Good news: the results are not only great for privacy compliance, they’re great for the bottom line. And increasingly, this evolution to next-generation data management with privacy in mind can occur seamlessly as data lake users learn to apply data privacy governance best practices to improve data quality and safely gain a 360-degree view of their consumers. Four practices make data safer to consume and distribute: discovering personal data across an organization, mapping this data to identities for managing consumer consent to ensure that it’s used responsibly, developing a risk-based heat map for data exposure, and surgically protecting the most critical personal attributes with data de-identification. Together they satisfy the agendas of chief data officers and of security and privacy compliance teams.

How do businesses operationalize an effective approach to protecting data lakes?

Whether you’re just starting out on a privacy compliance journey or looking to fill gaps in a data governance strategy, it’s relatively straightforward to bring in the necessary components of a complete solution to close a gap or improve capabilities to lower data risks. This solution includes a framework approach that can enable you to:

  • Use discovery and classification of personal data found in data lake analytics, and mapped to identities (users and systems), to help understand and remediate risk exposure
  • Apply data workflow tools for policy definition and data movement clarity that help develop clear ownership and trust in data operations with privacy in mind
  • Manage consumer consent to use their personal data and enforce rights for appropriate use with a 360-degree view of relationships
  • Protect data privacy with fine-grained, data-centric controls that secure sensitive data with surgical precision while keeping it open for business with limited, conditional access
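The first step in the list above, discovery and classification, can be sketched in a few lines of Python. This is a simplified illustration that assumes pattern matching on sampled rows. The two regular expressions, the labels, and the sample data are assumptions made for the example, and production discovery tools typically combine many more signals.

```python
import re

# Illustrative patterns only; real classifiers cover far more data types.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_columns(rows):
    """Map each column name to the set of personal-data labels found in it."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(label)
    return findings


# Made-up sample rows for illustration only.
sample_rows = [
    {"contact": "jane@example.com", "note": "SSN 123-45-6789 on file",
     "region": "southwest"},
    {"contact": "n/a", "note": "no issues", "region": "northeast"},
]

findings = classify_columns(sample_rows)
```

Columns flagged in `findings` would be the candidates for the consent tracking and fine-grained protection described in the later bullets.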

Unleash and accelerate data lake value creation with new insights when governed safely

Although data privacy mandates with new consumer rights will continue to evolve with new legislation, organizations can still simplify their compliance journey with a comprehensive data lake privacy governance approach that is flexible and scales by applying a repeatable framework. Any effort taken today not only helps to set the stage for safer data analytics but offers an opportunity to improve data quality and lower risk by disposing of data that takes up valuable resources or creates liability.

Approaching data lake privacy governance as a “glass half full” opportunity to help accelerate business value creation and optimization—rather than a cost of doing business—is an ideal mindset. In the long term, having a business acceleration point of view can help in winning over internal stakeholders—such as chief data officers, line of business owners, and marketing teams—who need to see privacy risk reduction as an opportunity to drive better business outcomes. With the California Consumer Privacy Act taking effect on January 1, 2020, and with other state mandates to follow, now is the best time to re-evaluate data governance goals and organizational readiness for unleashing data lakes with trust and confidence.

Want to learn more about unleashing data lake governance for value creation? Then register for our Data Lake Governance Virtual Summit in your region.

First Published: Nov 12, 2019