Artificial intelligence (AI) has the potential to transform industries and drive innovation in our constantly evolving technological landscape. However, its adoption is hindered by several challenges, and data management is one of the most significant.
A recent McKinsey global survey found that “The share of organizations that have adopted AI overall remains steady, at least for the moment, with 55 percent of respondents reporting that their organizations have adopted AI.”1 The same survey also noted that “AI high performers are not immune to the challenges of capturing value from AI, the results suggest that the difficulties they face reflect their relative AI maturity, while others struggle with the more foundational, strategic elements of AI adoption.”2
So, to harness the power of AI, we must first address the barriers of data accessibility, data quality and privacy. These barriers need to be dealt with before we start building and training AI and machine learning (ML) models. Only then can we put these models into action.
Data Management: The Keystone Challenge in AI Adoption
When organizations decide to use AI, they face the challenge of handling large volumes of data and the complex, performance-intensive workloads that AI applications such as generative AI require. Many organizations' current data platforms and architectures are not equipped to keep up with the growing number of AI use cases and the business demand for greater efficiency and automation.
“The most frequently cited technological inhibitor to AI/ML deployments is data management (32%), outweighing challenges for security (26%) and compute performance (20%), evidence that many organizations’ current data architectures are unfit to support the AI revolution.”3
S&P Global Market Intelligence, 2023 Global Trends in AI Report
It may be surprising to learn that 60% to 80% of AI projects fail.4 To better understand why, let’s look at the basics of AI development before we break down the key challenges of data management in AI adoption.
Back to Basics: AI Model Development
If you're interested in deploying AI or generative AI models, it's essential to understand the high-level steps involved. Figure 1 shows that the success of these models depends on collecting and preparing relevant data, extracting relevant features from the prepared dataset and training the model. Each of these steps depends on the quality of the data that feeds it.
Data for AI, or rather meaningful data for AI, is much more difficult to collect than you might think. Now, let’s look at the top data management challenges and how to overcome them.
1. Data Integration – The Shapeshifter
Data comes in many forms, from different sources and in various volumes. Many machine learning algorithms need substantial amounts of data to yield meaningful outcomes. Neural networks, for example, need a lot of training data, and the larger the network, the more data it needs to produce good results. But good data can be hard to get. Here are some of the key barriers that make it hard to gather meaningful data:
- Data silos
- Duplicate data
- Data incompatibility
- Complex ETL processes
This is why choosing the right data ingestion and data integration tools and platforms that align with your objectives is so important. You need to ensure they can handle various data formats and sources, offer real-time data integration capabilities and support scalability.
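To make the barriers above concrete, here is a minimal sketch of what an integration step has to do when two silos hold overlapping records in incompatible formats. The source names (`crm_rows`, `billing_rows`), the field names and the "first source wins" rule are all assumptions made for illustration; a real integration platform would handle far more formats and conflict-resolution policies.

```python
from datetime import datetime

# Hypothetical records from two siloed sources with incompatible schemas.
crm_rows = [
    {"email": "Ana@Example.com", "signup": "2023-01-15"},
    {"email": "bo@example.com", "signup": "2023-02-01"},
]
billing_rows = [
    {"EMAIL_ADDR": "ana@example.com ", "created": "15/01/2023"},
    {"EMAIL_ADDR": "cy@example.com", "created": "03/03/2023"},
]

def normalize_crm(row):
    """Map a CRM row onto the shared schema (email, signup_date)."""
    return {
        "email": row["email"].strip().lower(),
        "signup_date": datetime.strptime(row["signup"], "%Y-%m-%d").date(),
    }

def normalize_billing(row):
    """Map a billing row onto the shared schema (email, signup_date)."""
    return {
        "email": row["EMAIL_ADDR"].strip().lower(),
        "signup_date": datetime.strptime(row["created"], "%d/%m/%Y").date(),
    }

def integrate(crm, billing):
    """Merge both sources, keeping one record per normalized email."""
    merged = {}
    normalized = [normalize_crm(r) for r in crm] + [normalize_billing(r) for r in billing]
    for record in normalized:
        # Assumed policy: the first source seen wins when emails collide.
        merged.setdefault(record["email"], record)
    return sorted(merged.values(), key=lambda r: r["email"])

unified = integrate(crm_rows, billing_rows)
for record in unified:
    print(record["email"], record["signup_date"].isoformat())
```

Normalizing the key (trimmed, lowercased email) before merging is what lets the duplicate "Ana" record collapse into one row even though the two systems stored it differently.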
2. Data Quality – The Illusionist
Data quality — or the lack thereof — can lead to multiple issues during the AI adoption process. Poor-quality data can cause problems like inaccurate predictions and decisions, biases, wasted resources and sometimes even legal consequences. To avoid these issues, it’s crucial to have reliable data free from errors.
This illusionist disguises poor-quality data, inconsistent data and missing values as reliable resources for AI models. According to Boston Consulting Group (BCG), unique, high-quality data is a primary source of competitive advantage in a generative AI world.5
As shown in Figure 2, there are four steps to start improving your data quality:
- Discover: Use data profiling to uncover data quality issues such as anomalies, inaccuracies and duplicates. (For more information on how to get started with data profiling, please refer to this quick start guide. You can also try this interactive demo to see it in action.)
- Define: Design data cleansing and standardization rules based on underlying data issues and get them reviewed by data scientists and data/business analysts to include business-specific rules.
- Integrate: Integrate data quality rules into your data pipeline for AI.
- Monitor: Continually monitor and improve your data quality.
For more information, please refer to our eBook, The Art of Mastering Data Quality for AI and Analytics.
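The four steps listed above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a description of any particular product: the sample records, the 0 to 120 age range, the rule that rows with missing ages are dropped and the `quality_gate` helper are all hypothetical.

```python
# Hypothetical customer records with typical quality problems.
records = [
    {"id": 1, "country": "us", "age": 34},
    {"id": 2, "country": "US ", "age": None},  # missing value
    {"id": 3, "country": "de", "age": 212},    # implausible value
    {"id": 1, "country": "us", "age": 34},     # duplicate id
]

def profile(rows):
    """Discover: count missing ages, duplicate ids and out-of-range ages."""
    ids = [r["id"] for r in rows]
    return {
        "rows": len(rows),
        "missing_age": sum(r["age"] is None for r in rows),
        "duplicate_ids": len(ids) - len(set(ids)),
        "age_out_of_range": sum(
            r["age"] is not None and not 0 <= r["age"] <= 120 for r in rows
        ),
    }

def cleanse(rows):
    """Define: standardize country codes, drop duplicates and invalid ages."""
    seen, clean = set(), []
    for r in rows:
        if r["id"] in seen:
            continue
        if r["age"] is None or not 0 <= r["age"] <= 120:
            continue
        seen.add(r["id"])
        clean.append({**r, "country": r["country"].strip().upper()})
    return clean

def quality_gate(rows, max_issue_rate=0.0):
    """Integrate and monitor: return the issue rate and whether the batch passes."""
    report = profile(rows)
    issues = report["missing_age"] + report["duplicate_ids"] + report["age_out_of_range"]
    rate = issues / report["rows"] if report["rows"] else 0.0
    return rate, rate <= max_issue_rate

before = profile(records)
clean = cleanse(records)
rate_after, passed = quality_gate(clean)
print(before, clean, rate_after, passed)
```

In a pipeline, `quality_gate` would run on every incoming batch so that the issue rate can be tracked over time and a failing batch can be stopped before it reaches model training.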
3. Data Privacy and Protection – The Guardian
A lack of oversight in data privacy and protection can pose an even more significant challenge and lead to a loss of trust in AI adoption.
Generative AI poses privacy concerns as it deals with personal data and may produce sensitive content. Personal information such as names, addresses and contact details may be collected unintentionally when interacting with AI systems. Using personal data in generative AI algorithms could result in accidental disclosures or data misuse. It is essential to be aware of these risks when using generative AI.
Regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA) help ensure that only the right people can access personal information and that it remains safe.6 The European Union has also proposed the European Union AI Act (EU AI Act) after conducting a thorough review of various risk factors. The legislation establishes obligations for providers and users based on the level of risk associated with an AI system.7
Data access governance is foundational for solving the broad range of data privacy and protection challenges that arise when adopting AI. The key areas data access governance addresses are:
- De-risk data to comply with AI and privacy policies and standards
- Enable AI and analytics
- Improve confidence in data use
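One common way to de-risk data before it reaches a generative AI system is to mask personal details in the text first. The sketch below assumes two simple regular expressions for email addresses and US-style phone numbers; both patterns and the `mask_pii` helper are hypothetical, and real PII detection needs much broader coverage (names, addresses, identifiers) than this.

```python
import re

# Illustrative patterns only; they will miss many real-world formats.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def mask_pii(text):
    """Replace email addresses and simple phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

prompt = "Contact Ana at ana.diaz@example.com or 555-123-4567 about her claim."
safe_prompt = mask_pii(prompt)
print(safe_prompt)
```

Note that the name "Ana" passes through unchanged, which shows why pattern matching alone is not enough and why masking belongs inside a wider access governance policy.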
4. Data Governance – The Watcher
Data governance is another key barrier due to the evolving nature of AI and the critical importance of data. Data governance challenges in AI include:
- Data Access and Control: Balancing the need for data access across different teams and roles while maintaining control and security is one of the critical data governance challenges. Data should be made available to those who need it while respecting data ownership and usage policies.
- Bias and Fairness: Data governance should address bias in AI systems, ensuring that training data is representative and free from biases. Fairness in AI outcomes is a governance concern, as biased models can have adverse real-world consequences.
- Ethical Considerations: Determining what data is ethically collected and used for AI purposes is critical for trusted AI adoption. This includes defining ethical guidelines for data handling and model behavior.
- Data Ownership and Rights: It’s crucial to clarify data ownership and establish rights for appropriate data use. Data should be managed with a clear understanding of who controls it and how it can be used.
- Change Management: Implementing data governance in AI requires changes to workflows and practices. Managing these changes and ensuring staff adhere to new governance policies is an ongoing challenge.
Organizations must develop comprehensive data governance frameworks tailored to AI to address these challenges. These frameworks should focus on ethical practices, data quality, privacy compliance and the secure, responsible use of data throughout the AI development and deployment process.
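As one small example of turning a governance policy into an automated check, the sketch below compares how groups are represented in a training set against reference shares, in the spirit of the bias and fairness point above. The `representation_gaps` function, the `region` attribute, the reference shares and the 10 percent tolerance are all assumptions chosen for illustration; representativeness is only one of several fairness measures a framework would need.

```python
from collections import Counter

def representation_gaps(rows, attribute, reference_shares, tolerance=0.10):
    """Return groups whose share in the data differs from the reference share
    by more than the tolerance, mapped to the signed difference."""
    counts = Counter(row[attribute] for row in rows)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            gaps[group] = round(observed - expected, 2)
    return gaps

# Hypothetical training set that over-represents one region.
training_rows = [{"region": "north"}] * 8 + [{"region": "south"}] * 2
reference = {"north": 0.5, "south": 0.5}

gaps = representation_gaps(training_rows, "region", reference)
print(gaps)
```

A check like this could run alongside data quality rules, so that a skewed dataset is flagged for review before a model is trained on it.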
To learn more about a solution that provides all-in-one data discovery, data catalog, data governance, data lineage and access to trusted data, visit our Informatica Cloud Data Governance and Catalog solution page.
The Future of AI Adoption
Advancements in AI technology have gained momentum with generative AI tools; however, adoption is still in its early stages. The key barriers to adoption and success will not go away overnight. But with the right tools, strategies and a dedicated team, you can navigate the obstacles step by step.
A unified and composable data management platform can not only accelerate your journey to successful AI adoption but also help you prepare for future advancements.
Remember, it's a journey, not a destination. Embrace the quest to become a data hero, and you'll lead your organization toward a brighter, data-driven future.
1 QuantumBlack AI by McKinsey, The state of AI in 2023: Generative AI’s Breakout Year
3 S&P Global Market Intelligence, 2023 Global Trends in AI Report
7 European Parliament News, EU AI Act: first regulation on artificial intelligence