Why Pfizer Cares About Data Integration and ETL in the Cloud

Nov 30, 2021 | Pratik Parekh

GVP and GM, Cloud Modernization

Hardly a day goes by without news of Pfizer’s latest life-saving medicines and therapies. But behind the company’s cutting-edge scientific research is a similarly innovative approach to data management, one that ensures that trusted insights are available at every step in the pharmaceutical value chain, from drug discovery and testing to global manufacturing and delivery.

At our recent Cloud Modernization Summit with Informatica and Snowflake, I had the pleasure of speaking with John Leone, a 20-year veteran at Pfizer who leads its data and analytics programs ranging from ETL design and development to data warehouse and business intelligence projects. Recently, he’s been on the front lines of Pfizer’s journey to modernize its analytics platform in the cloud. In our conversation, John talks about the key drivers behind this decision, the benefits of automation and cloud-native integration, as well as best practices for migrating its data management from PowerCenter to Informatica Intelligent Cloud Services (IICS).

The following interview has been edited for length and clarity.

Pratik Parekh: Can you tell us about your role at Pfizer and the company’s vision for data?

John Leone: I'm part of Pfizer Digital, and in particular a group that provides data analytics services globally. As a company, we’re always striving to bring medicines to the world faster, engage patients and physicians better, and digitize drug discovery and development across all areas – oncology, ophthalmology, virology, almost every critical area of healthcare. All those initiatives require data and analytics. So, the goal is to bring the right kind of data to the right user at the right time. This includes everything from Informatica and Snowflake to custom built solutions, such as a major initiative underway called Pfizer Insights.

Pratik Parekh: What are some of the goals driving your cloud modernization program?

John Leone: One is cost savings. We're a nimbler company than we were even a few years ago, and we’re always trying to do more with less. And then there is the trifecta of cloud: enabling agility, innovation, and scalability. We’re completely retooling how we work, transforming from an IT shop to a software development shop, changing the kind of talent we hire, and how we upskill Pfizer resources. We're essentially reinventing ourselves.

From a technology standpoint, we have been cloud-first for analytics for a number of years, but we're just embarking on modernizing our on-prem portfolio. The goal is, in the next few years, to become a cloud-only company to the extent we can. There are always going to be complex business processes, or certainly manufacturing sites, that need on-prem solutions. So, we'll always be hybrid, but we're on our way to becoming a cloud-native company.

Pratik Parekh: How has your relationship with Informatica evolved throughout this journey?

John Leone: We've been a PowerCenter shop for the past 20 years. And PowerCenter still powers a lot of our core business processes. We have probably 60+ PowerCenter applications operating globally. We process hundreds of billions of rows a month with PowerCenter. But we've also been a long-term IICS shop. We've used IICS since 2015. We brought it in primarily for Salesforce integration, as an easy-to-use tool for that. And then over the past couple of years, we've been deploying more sophisticated, complex solutions in manufacturing, commercial, and order management.

Pratik Parekh: What did cloud migration look like for manufacturing use cases?

John Leone: PGS stands for Pfizer Global Supply, and that's our manufacturing and distribution function. A few years ago, we embarked on an initiative to consolidate data analytics into a data lake, and they're well on their way. More recently, they decided to go with Snowflake and start to migrate their on-prem analytics to the cloud. The reality is this data warehouse has been around for years and has thousands of mappings, sessions, and workflows. We've invested years into the application and business logic. Migrating it to the cloud was a formidable prospect, both from a cost and timeline perspective.

A few months ago, we engaged Informatica for its cloud data warehouse modernization service. We met with them and performed an assessment, and they determined that nearly 99% of mappings in workflows and sessions converted automatically through the tool that Informatica has built. And through the automation provided by the cloud data warehouse modernization service, we're able to convert virtually all the code. I think what also made us feel more confident is that we're not just getting the tool. We're getting a service. We knew that if we encountered issues, either with the tool or with converting to IICS, they'd be able to help us.

Pratik Parekh: How do you define the value or success of these programs?

John Leone: There's a lot of complexity in our solutions, and especially ones that have been in existence for years. To be able to retain all that business logic without losing a lot of fidelity, that's number one. I think also knowing that what we got back as Informatica's IICS code was tested and demonstrated to meet standards for being able to run and for reusability was a great win for us. And the reality is that they've been able to convert the code and provide it back to us on even better timelines than we thought, which has reduced the level of effort we've had to spend on making it work at Pfizer.

Pratik Parekh: With so many technology choices out there, what were you looking for in your ecosystem partners?

John Leone: Snowflake is considered a strategic technology. It really is going to help enable those goals we have around modernization, whether that be the ability to scale up data processing, data sharing, or data replication; it’s also fairly maintenance free. And IICS has very strong integration with Snowflake; it enables you to quickly load and unload data and push down your logic into Snowflake. It's been a really strong fit. There are a number of other reasons why we believe IICS is our preferred data integration tool. It’s also cloud-agnostic. AWS is here today, but we're venturing into Google Cloud and Azure, and obviously Snowflake. So, we need a tool that can grow with us, that’s multi-cloud, that’s future-proof.

Pratik Parekh: How did your developer workforce manage the transition from using PowerCenter to picking up IICS?

John Leone: It's been a very seamless transition going into the cloud. We were moving from PowerCenter and Oracle to IICS and Snowflake. Our sources were also changing, from pointing to the actual source systems to pointing to the manufacturing data lake. But the concepts are similar, and you still have things like mappings and sessions. The GUI will feel familiar. There are even some repetitive tasks that in PowerCenter still took a good deal of coding, and now are more task-driven or wizard-based in IICS. So, in a lot of ways, it simplifies the development process.

Even though PowerCenter has been around forever and has lots of features you can use in different ways, in converting those jobs to IICS, we’ve been able to map them without any issues. And we really haven't needed a lot of training sessions. I think that’s a good indicator of where IICS is now from a feature standpoint and demonstrates that it's a pretty straightforward transition from PowerCenter to IICS.

Pratik Parekh: Are there any key learnings you’d like to share with those embarking on a similar journey from PowerCenter to IICS?

John Leone: One of the big ones is to leave time in your project for the unexpected. There will be some hiccups that you'll need to get through. Also, know that in general, not everything is a good candidate for moving from PowerCenter to IICS. Maybe you have a project where rewriting it makes sense, or maybe it's a short-lived solution that you just need temporarily to work in the cloud, and so you can band-aid it. But if you have a lot of business logic and a lot of investment in your on-prem solutions, there's really no other way to get it to the cloud. I don't see another option in the marketplace. Rewriting would be cost-prohibitive and take years.

Learn How to Accelerate Your Cloud Journey

To learn more about data-driven transformation, visit our life sciences page to see how clinical and pharmaceutical leaders are optimizing R&D, supply chain, and commercial initiatives through AI-powered data management. And if you’re looking to migrate to a cloud data warehouse, our PowerCenter Modernization resource center can help you get started, with faster time to value and automated, risk-free conversion of PowerCenter mappings to IICS.