Healthcare has long been a top priority for citizens and governments alike, and for all the right reasons, from curtailing the cost of healthcare, delivering optimal patient care, and protecting patient data to IT modernization and digital transformation of our healthcare system. After all, nothing is more important than our health. This realization often becomes more acute in our collective consciousness when we face a medical crisis or a pandemic like the one we are witnessing today with COVID-19.
I have had the good fortune to work with payers and providers globally over the past 20 years on all things related to data management and analytics. The advancements the industry has made in recent years on its digital transformation journey have sparked great excitement and hope for what lies ahead.
Recently, I had the opportunity to speak with Dharam Padhaya, Senior Principal Architect, Data Management and Analytics at Hackensack Meridian Health, to get his perspective on the challenges and opportunities stemming from data in healthcare, as well as Hackensack’s digital transformation initiatives and its use of Informatica Enterprise Data Catalog. During our discussion, we tackled a number of questions. The following are a few highlights from our conversation.
Data is indisputably the lifeblood of any healthcare institution. According to IDC, global healthcare data will surpass 2,000 exabytes this year.¹ The opportunities presented by data in healthcare are immense: precision medicine, remote patient monitoring, genome sequencing in clinical trials, and analysis of massive volumes of data for diagnosing and finding cures for chronic illnesses. The challenges that come with a highly complex data landscape and data sprawl are equally significant.
For instance, our data is spread across hundreds of systems, including electronic medical record (EMR) systems, medical imaging systems, financial systems, data warehouses, and databases, among other data sources. In this type of environment, it is very challenging to know what data we have, where it resides, who owns which datasets, and whether the data is certified. We also need to ascertain who should have access to which datasets. For instance, is the data highly sensitive, such as protected health information (PHI) or personally identifiable information (PII)? And lastly, what are all the data dependencies associated with each dataset? In healthcare, being able to rapidly discover and profile all your data and understand the relationships in data at a granular level is critical.
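To make those questions concrete, here is a minimal Python sketch of what a catalog record might capture: where a dataset lives, who owns it, whether it is certified, and whether sampled values look sensitive. Everything in it is an illustrative assumption (the `CatalogEntry` fields, the `classify_column` and `register` helpers, and the two regex patterns); it is not how Informatica Enterprise Data Catalog works internally, and real PHI/PII detection requires far more than pattern matching.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only.
SENSITIVE_PATTERNS = {
    "SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


@dataclass
class CatalogEntry:
    """One dataset as a catalog might describe it (assumed fields)."""
    name: str
    system: str               # where the data resides
    owner: str                # who owns the dataset
    certified: bool = False   # whether the data is certified
    sensitive_tags: set = field(default_factory=set)


def classify_column(values):
    """Return the tags whose pattern matches every sampled value."""
    tags = set()
    for tag, pattern in SENSITIVE_PATTERNS.items():
        if values and all(pattern.match(v) for v in values):
            tags.add(tag)
    return tags


def register(catalog, entry, sample_columns):
    """Tag the entry from sampled column values and store it by name."""
    for values in sample_columns.values():
        entry.sensitive_tags |= classify_column(values)
    catalog[entry.name] = entry
    return entry
```

A steward could then filter the catalog on `sensitive_tags` to decide which datasets need restricted access.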
To truly extract value from your data, you must first catalog all the data. And this is why an enterprise data catalog is an essential component for any digital transformation initiative, including ours.
Enterprise Data Catalog enables us to achieve both scale and speed in scanning, discovering, understanding, and validating millions of datasets regardless of where that data resides. Its advanced machine learning algorithms enable automatic classification and identification of domains and entities. Moreover, the end-to-end data lineage and impact analysis capabilities allow us to develop a comprehensive understanding of our data and all the data dependencies across the data pipeline from source to target. Our IT and business users, such as data stewards and data analysts, can more easily discover, validate, and collaborate on data. The operational efficiencies, such as time and cost savings, can be significant over time. For instance, it would be virtually impossible to accurately discover and understand relationships across petabytes of data spread over 200 data sources if one had to attempt it manually. Think of the time, cost, and effort that would entail, as well as the associated operational risks.
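The idea behind impact analysis can be sketched in a few lines of Python: treat lineage as a directed graph from source to target and walk it to find everything downstream of a dataset. The dataset names, the `LINEAGE` mapping, and the `downstream_impact` function are hypothetical and are not part of any Informatica API; a catalog derives lineage automatically from scanned metadata rather than from a hand-written dictionary.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed in its value.
LINEAGE = {
    "emr.patients": ["warehouse.patient_dim"],
    "emr.encounters": ["warehouse.encounter_fact"],
    "warehouse.patient_dim": ["reports.readmissions"],
    "warehouse.encounter_fact": ["reports.readmissions", "reports.billing"],
}


def downstream_impact(lineage, source):
    """Breadth-first walk returning every dataset that depends on `source`."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # visit each dataset once, even in a cycle
                seen.add(child)
                queue.append(child)
    return seen
```

Calling `downstream_impact(LINEAGE, "emr.encounters")` lists the warehouse table and reports that a change to the encounters source could affect, which is the question that becomes impractical to answer by hand across hundreds of sources.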
There were a number of factors we took into consideration. First, Informatica Enterprise Data Catalog is a leader in the metadata management category, and Gartner’s Magic Quadrant for Metadata Management Solutions² corroborates that. Second, we were looking for an integrated platform approach to data management, and Enterprise Data Catalog supports that very nicely. For instance, the catalog integrates well not only with other Informatica products that we currently use, such as Axon Data Governance, but also with a wide array of third-party products in our environment, including Tableau. After careful evaluation, we also felt that the semantic search feature, the AI-enabled automation powered by the Informatica CLAIRE™ AI engine, and the data lineage and impact analysis capabilities met our requirements.
Currently we are scanning terabytes of datasets from over 15 data sources, including our EMR system (EPIC). We are taking a phased approach to deployment given our complex data landscape. Over time, we expect to add several more data sources, thus increasing the volume of datasets scanned. We also plan on making the catalog available to more users including some of our data analysts and data stewards. Our current use case for the catalog is predominantly centered around data governance and compliance. However, we plan on expanding the use of the catalog to support more use cases in the area of self-service analytics and data science in the near future.
To learn more about data cataloging, we recommend the eBook, “Drive Your Business Forward with a Catalog of Catalogs.”
1 How CIOs Can Prepare for Healthcare “Data Tsunami”
2 Gartner, Magic Quadrant for Metadata Management Solutions, Guido De Simoni, Mark Beyer, Ankush Jain, 16 October 2019
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.