Fivetran Introduces Managed Data Lake for AI Workloads

Data teams are jumpstarting their generative AI and LLM initiatives with Fivetran Managed Data Lake Service by improving the quality, completeness and timeliness of enterprise data and reducing the complexity of managing data integration.

Fivetran, the global leader in data movement, announced the general availability of the Fivetran Managed Data Lake Service, designed to automate and simplify data lake management for businesses of all sizes. Fivetran supports over 500 pre-built as well as custom data sources, seamlessly integrating them into any major data lake destination while employing powerful change data capture, normalization, compaction and deduplication processes. Fivetran’s Managed Data Lake Service is currently available on Amazon S3, Azure Data Lake Storage (ADLS) and Microsoft OneLake.

The Fivetran Managed Data Lake Service simplifies data lake management by automatically converting customer data to popular open table formats (Apache Iceberg or Delta Lake) before landing it in the data lake. Combined with Fivetran’s ongoing table management and maintenance, this gives customers the queryability and ease of use of a cloud data warehouse with the flexibility and scale of a data lake. Fivetran says no other data provider manages data lakes in this way, so its customers benefit from both the low cost of data lakes and the structure and reporting capabilities of data warehouses. Users can build out their data lake with query-ready data that data warehouses can read through their external table features, without moving or duplicating records across multiple locations. This supports a range of use cases, including analytical, operational and generative AI workloads.

The new Fivetran Managed Data Lake Service differentiates itself by not only converting data and centralizing it in the lake but also providing an end-to-end data lake management service that automates low-level data management tasks entirely. “Fivetran does the heavy lifting of change data management, PII detection, deduplication and other low-level table maintenance so that developers don’t waste time on work that can be automated,” said George Fraser, Fivetran CEO. “We hope to make business users and data scientists alike more productive by providing clean, centralized, optimized data from any source.”

This level of automation and maintenance is crucial for many organizations. As Nick Chmura, Head of Data at Luma Financial Technologies, explains, “Automated table maintenance is the killer feature for us with Fivetran because we have so many different source connectors. To try to build change data capture and manage that for everything…would be prohibitively costly in terms of time.”
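
To make concrete what change data capture and deduplication involve at the table level, here is a minimal Python sketch. It is not Fivetran's implementation: the change-record shape (`op`, `id`, `ts`, `data`) and the function name `apply_changes` are hypothetical, chosen only for illustration. The sketch keeps the newest change per primary key, which discards duplicate and out-of-order deliveries, and then applies the surviving upserts and deletes to a keyed table.

```python
from typing import Any

# Hypothetical change-record shape (an assumption, not Fivetran's format):
#   {"op": "upsert" | "delete", "id": <primary key>, "ts": <int>, "data": {...}}
Change = dict[str, Any]


def apply_changes(table: dict[Any, dict], changes: list[Change]) -> dict[Any, dict]:
    """Fold a batch of change records into a table keyed by primary key."""
    # Deduplicate: keep only the newest change per key (highest "ts").
    latest: dict[Any, Change] = {}
    for change in changes:
        seen = latest.get(change["id"])
        if seen is None or change["ts"] > seen["ts"]:
            latest[change["id"]] = change

    result = dict(table)  # copy, so the caller's table is left untouched
    for key, change in latest.items():
        if change["op"] == "delete":
            result.pop(key, None)  # deleting a missing row is a no-op
        else:
            result[key] = change["data"]
    return result


table = {1: {"name": "Ada"}}
changes = [
    {"op": "upsert", "id": 2, "ts": 10, "data": {"name": "Bob"}},
    {"op": "upsert", "id": 2, "ts": 10, "data": {"name": "Bob"}},  # duplicate delivery
    {"op": "upsert", "id": 1, "ts": 12, "data": {"name": "Ada L."}},
    {"op": "delete", "id": 2, "ts": 15, "data": None},
]
print(apply_changes(table, changes))  # {1: {'name': 'Ada L.'}}
```

Even this toy version has to decide how to order changes, how to treat repeated deliveries and how to handle deletes. Repeating those decisions for every source connector is the cost the quote above describes.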

Fivetran Managed Data Lake Service helps transform traditionally ungoverned data lakes into organized, governed, continuously optimized data stores. With native integrations with data catalogs including AWS Glue, Databricks Unity Catalog and Microsoft Purview, users can quickly discover, access and govern key datasets from the lake. From there, users can query and modify the data with Python, SQL or other supported languages by leveraging compatible compute engines like Databricks, Snowflake, Starburst or Redshift. They can also transform the data with tools like dbt, visualize it with Power BI, or build and deploy AI/ML models with tools like Amazon SageMaker, Azure Machine Learning or Databricks Mosaic AI.

Fivetran Managed Data Lake Service supports over 500 data sources, including on-premises and cloud databases like Postgres, MySQL, Oracle and SAP, SaaS applications, data warehouses, events and files. Fivetran can also create custom connectors, ensuring support for any data source without requiring precious engineering resources for pipeline management or connector development. This broad portfolio of source compatibility enables customers to unify their data in the data lake, regardless of where it currently resides.

“We are very excited about Fivetran supporting Delta Lake as a direct destination,” said Himanshu Raja, Director of Product, Databricks. “With this new capability, customers can now use Fivetran to build an open lakehouse with Delta Lake powered by the Databricks Data Intelligence Platform. We are also very excited about the upcoming Fivetran integration with Unity Catalog to provide out-of-the-box governance and security for all Fivetran-generated tables.”

The benefits of Fivetran Managed Data Lake Service include:

Empowered business users and data scientists, with centralized, democratized, query-ready data that adds context, invites insights and drives data discovery.
Enhanced operational efficiency through automatic conversion of data to open table formats (Delta Lake or Apache Iceberg), with robust data cataloging and governance features.
Reduced developer workload, with Fivetran doing the heavy lifting of table updates, PII detection, deduplication and other low-level table maintenance tasks that can be automated.
Reduced costs through automated data migration away from legacy data warehouses that lock customers in with proprietary data formats.
Peace of mind from worry-free data replication that ensures datasets always arrive clean and complete, with every change captured.

In response to the surging demand for advanced AI, Fivetran has seen substantial growth and customer interest in data lake destinations, particularly among large enterprises. This increase in demand underscores the critical need for cost-effective, flexible data architectures that enable customers to achieve success with AI and machine learning.

Others in the industry are also seeing a demand for new architectures to meet evolving needs. “Starburst continues to see open architecture adoption gain momentum and Fivetran’s adoption of Iceberg in its Managed Data Lakes Service further validates the Icehouse architecture,” said Anders Holden, Director of Product Management at Starburst Data. “This service simplifies the ingestion process and removes complexity around data lake management. By using Starburst to run fast SQL analytics on the data lake, businesses will see a faster time to insights and a streamlined data pipeline process, while saving time and reducing costs.”

The Fivetran Managed Data Lake Service is available immediately. Fivetran fully automates and manages data standardization as it moves data to data lake destinations, making that data available for businesses to find new ways to innovate.

Source: Businesswire
