Elevate Your Cortex LLM Experience with Informatica SQL ELT

Last Published: Aug 30, 2024

Karthikeyan Mani

Principal Product Manager

LLMs at the Core of Ambitious AI-Powered Analytics Projects

The growing popularity of AI-powered analytics may lead you to explore the power of large language models (LLMs). These models understand natural-language requests and combine them with the knowledge captured in their training data and in the enterprise data you provide. That enriched knowledge base is what enables the AI to respond to user queries effectively.

Within the LLM world, Snowflake Cortex AI delivers easy, efficient and trusted enterprise AI to thousands of organizations, making it simple to create custom chat experiences, fine-tune best-in-class models and expedite no-code AI development. Cortex provides built-in functions such as COMPLETE, SENTIMENT, SUMMARIZE, EXTRACT_ANSWER and TRANSLATE, which business users can apply to a wide range of analytics use cases.
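For a sense of how these functions work, the statement below calls each of them directly in Snowflake SQL. This is a minimal sketch: the sample text is invented, and 'mistral-large' is just one of the models Cortex offers (availability varies by region).

  -- Each Cortex AI function is called like an ordinary SQL function and runs on Snowflake compute.
  SELECT
    SNOWFLAKE.CORTEX.SENTIMENT('The delivery was quick and the product works great.') AS sentiment_score, -- returns a value between -1 and 1
    SNOWFLAKE.CORTEX.TRANSLATE('配送が早くて助かりました。', 'ja', 'en') AS english_text,
    SNOWFLAKE.CORTEX.SUMMARIZE('The customer reported that setup took two hours because the manual was unclear, but support resolved the issue the same day.') AS summary,
    SNOWFLAKE.CORTEX.EXTRACT_ANSWER('Setup took two hours because the manual was unclear.', 'Why did setup take long?') AS answer,
    SNOWFLAKE.CORTEX.COMPLETE('mistral-large', 'Write a one-sentence thank-you reply to a customer who praised our delivery speed.') AS reply;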

For instance, the system can translate content from any language into English to facilitate further analysis and ranking. It can also summarize textual content, analyze it and assign scores based on predefined parameters. Moreover, the system can automatically generate relevant and contextual responses to customer service inquiries by following a series of prompts, and it can even complete these prompts if directed to do so.

With Cortex AI, you can automate and scale tasks such as scoring text with sentiment analysis, summarizing text, translating text from any supported language and completing text based on a given prompt.

Why LLMs Need Access to High-Quality Streaming Data

Cortex streamlines data processing, analytics and insight generation by utilizing machine learning (ML) and AI techniques. This assists businesses in efficiently managing and understanding vast quantities of unstructured data. Cortex not only analyzes and categorizes this data but also scores it. Additionally, it provides intelligent responses to free-flow questions from users.

But what powers the LLM? Even the most advanced, intelligent LLM needs access to high-quality streaming data; without it, the models cannot be trained or deliver reliable results. That means the vast volumes of inbound unstructured and multi-format data need to be brought together, on an ongoing basis, on a common platform so the LLMs can make sense of the data and use it for a range of analytics use cases.

Access to continuous, high-quality data necessitates an efficient data integration process that cleans, transforms and normalizes raw data into a format suitable for LLM operations. Robust data pipelines are essential as they streamline the transfer of data to feed these models. These pipelines act as a bridge by preprocessing, standardizing and structuring the data to ensure it is optimized for LLM functions. Crucial to functions such as sentiment analysis, translation or text summarization, data pipelines significantly enhance the accessibility and cost-effectiveness of LLM technologies.
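As a simple illustration of this kind of preparation, the sketch below stages raw comments before any LLM function touches them: it trims whitespace, drops empty records and keeps only the latest version of each comment. The raw_feedback table and its columns are hypothetical.

  -- Staging view that cleans and deduplicates raw text before it is passed to Cortex AI functions.
  CREATE OR REPLACE VIEW clean_feedback AS
  SELECT
    feedback_id,
    TRIM(comment_text) AS comment_text,
    UPPER(country_code) AS country_code
  FROM raw_feedback
  WHERE comment_text IS NOT NULL
    AND LENGTH(TRIM(comment_text)) > 0
  QUALIFY ROW_NUMBER() OVER (PARTITION BY feedback_id ORDER BY load_ts DESC) = 1; -- keep the most recent record per comment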

How Informatica Ecosystem SQL ELT Optimizes and Elevates the Cortex Experience

If you already use Cortex AI, you can optimize and elevate outcomes with Informatica Ecosystem SQL ELT, which now supports Cortex AI Native GenAI Functions. This convenient integration gives Cortex AI guaranteed access to trusted data and unlocks the full potential of your generative AI projects.

With seamless data integration built specifically for Cortex AI functions, you can create your ELT mapping using the Informatica no-code interface without leaving the Snowflake environment. Ecosystem SQL ELT can read and write data from cloud data warehouses or data lakes to targets in the same ecosystem, and it lets you incorporate Cortex AI Functions as part of your no-code data integration and data engineering pipelines running natively on the Snowflake AI Data Cloud.

This means you can develop efficient ELT pipelines that run directly on Snowflake compute resources. Not only will you leverage advanced Cortex AI capabilities directly within your data workflows, but you will also enhance the speed and performance of data processing for analytics while optimizing costs at any scale. 

Benefits of Informatica Ecosystem SQL ELT on Cortex AI

1. Resource Efficiency

Writing complex code or custom API calls to connect your data to Cortex takes up significant data engineering bandwidth, and the resulting code is often error-prone and hard to reuse.

Instead, incorporate advanced AI functions directly into data pipelines with Informatica’s no-code SQL ELT. Simply add an expression transformation and choose the desired Cortex AI function; the necessary scripts are generated automatically. These scripts establish connections between the two endpoints, execute the transformation and grant Cortex AI direct access to high-quality data, all within the Snowflake environment.
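The scripts themselves are produced by the Informatica mapping, so they are not reproduced here. Purely as an illustration, pushdown of a sentiment expression into Snowflake looks roughly like the sketch below; the schema, table and column names are hypothetical and the SQL actually generated may differ.

  -- Illustrative pushdown: apply the Cortex SENTIMENT function while loading the target, entirely inside Snowflake.
  INSERT INTO analytics.review_scores (review_id, review_text, sentiment_score)
  SELECT
    review_id,
    review_text,
    SNOWFLAKE.CORTEX.SENTIMENT(review_text)
  FROM staging.product_reviews;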

Figure 1: Leverage Cortex AI functions easily with a few clicks within your Snowflake environment.

Ecosystem SQL ELT uses the cloud data warehouse's compute capacity to read the data, perform any transformation and push the data to the target without the data ever leaving the ecosystem. This approach not only reduces the time to insight for AI-led analytics projects but also optimizes data engineering resources by streamlining processes.

2. Ease of Use

End users can consume Cortex Functions directly from within their ecosystem without writing code, simply by using the wizard-driven graphical user interface (GUI). For example, a company may want to run sentiment analysis on customer comments and feedback for a specific product from a specific region, such as Japan.

First, you would filter the data for that product from the Japan region. Next, you would translate the filtered comments from Japanese to English, analyze them and group them according to a defined scoring framework. For instance, comments with scores above 5 are positive, those in the middle are average and any below 2 are negative.

Figure 2: Informatica Ecosystem SQL ELT helps you leverage Cortex AI translate and sentiment functions to analyze, score and group customer comments, all within the Snowflake environment.

The entire end-to-end process of Source → Filter → Translate → Sentiment Score → Sentiment Group → Target is a simple, drag-and-drop workflow that requires no coding effort at any stage. The Informatica native SQL ELT no-code/low-code framework and simple GUI-based interface mean anyone can build pipelines and use Cortex Functions, even without any programming or coding knowledge.
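For readers curious what the equivalent logic looks like in plain Snowflake SQL, here is a sketch of the same Filter, Translate, Sentiment Score and Sentiment Group steps expressed as a single statement. The table and column names, the product and region filters, and the rescaling of the SENTIMENT output (which natively ranges from -1 to 1) to the 0-10 framework above are all illustrative assumptions; the SQL that Ecosystem SQL ELT actually generates may differ.

  CREATE OR REPLACE TABLE analytics.product_feedback_scored AS
  WITH filtered AS (
    -- Source + Filter: comments for one product from the Japan region
    SELECT feedback_id, comment_text
    FROM raw.customer_feedback
    WHERE product_id = 'PRODUCT_X' AND region = 'JP'
  ),
  translated AS (
    -- Translate: Japanese to English
    SELECT feedback_id,
           SNOWFLAKE.CORTEX.TRANSLATE(comment_text, 'ja', 'en') AS comment_en
    FROM filtered
  ),
  scored AS (
    -- Sentiment Score: rescale the -1..1 output to an assumed 0..10 scoring framework
    SELECT feedback_id, comment_en,
           (SNOWFLAKE.CORTEX.SENTIMENT(comment_en) + 1) * 5 AS score
    FROM translated
  )
  -- Sentiment Group + Target
  SELECT feedback_id, comment_en, score,
         CASE WHEN score > 5 THEN 'positive'
              WHEN score < 2 THEN 'negative'
              ELSE 'average' END AS sentiment_group
  FROM scored;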

3. Operational and Cost Efficiency

Building the ELT mapping within the Snowflake environment means your data stays within the ecosystem. Ecosystem SQL ELT converts mapping logic to equivalent, optimized SQL queries that your cloud data warehouse executes. This happens without data movement, which would typically incur additional processing time and data transfer charges.

This method improves data processing performance and reduces network latency as data is processed close to the source. Removing unnecessary data movement also saves data transfer (egress) costs, as the data integration uses existing Snowflake credits.

4. Future-Proof Ecosystem-Native ELT

Informatica Ecosystem SQL ELT is ecosystem agnostic, which means its pre-built functionality can handle complex transformations within virtually any cloud ecosystem, including Snowflake, so you can easily repoint connections in the future if you switch your cloud data warehouse.

Get Started with Data Integration for Cortex AI Within Your Snowflake Ecosystem

IDMC users get automatic access to the new SQL ELT capability and can start working with it from within Snowflake immediately.

If you are new to Informatica or not yet an IDMC user, you can unlock the full power of Cortex AI Functions with just a few clicks. Get the no-code Cloud Data Integration-Free (CDI-Free) service on Snowflake to easily load, transform and integrate your data. It is a fast, free and proven way to ensure your analytics and generative AI use cases are firmly grounded in accurate, relevant data in your cloud environment.

First Published: Aug 15, 2024