Microsoft Fabric Updates Blog

Store and access your Iceberg data in OneLake using Snowflake and shortcuts

Microsoft Fabric is a unified SaaS data and analytics platform designed for the era of AI. All workloads in Microsoft Fabric use Delta Lake as the standard, open-source table format. With Microsoft OneLake, Fabric’s unified SaaS data lake, customers can unify their data estate across multiple cloud and on-premises systems.

This past May, we announced the expansion of our partnership with Snowflake to include support for Apache Iceberg-formatted data in OneLake and bi-directional data access between Snowflake and Fabric.

Announcing Public Preview: Today we are thrilled to announce that customers can now consume Iceberg-formatted data across Microsoft Fabric with no data movement or duplication! With our latest update, customers can use OneLake shortcuts to point to an Iceberg table written using Snowflake or another Iceberg writer, and OneLake virtualizes that table as a Delta Lake table for broad compatibility across Fabric engines. We’re also excited to announce a step forward in our integration with Snowflake: Snowflake has added the ability to write Iceberg tables directly to OneLake.

It’s easy to get started, and we have much more coming soon. Try this feature today!

Get started today

To use your existing Iceberg data in Fabric, it’s just a matter of creating a OneLake shortcut to that data. For full instructions, see our Getting Started guide. Here are the basic steps:

  1. Find where your Iceberg tables are stored. This could be in any of the external storage locations supported by OneLake shortcuts, including Azure Data Lake Storage, OneLake, Amazon S3, Google Cloud Storage, or an S3-compatible storage service.
  2. In a non-schema-enabled Fabric lakehouse, create a new shortcut in the Tables area.
  3. For the target path of your shortcut, select the Iceberg table folder. This is the folder that contains the ‘metadata’ and ‘data’ folders.
  4. That’s it! Once your shortcut is created, you should automatically see this table reflected as a Delta Lake table in your lakehouse, ready for you to use throughout Fabric.
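Before creating the shortcut in step 3, it can help to confirm that the folder you plan to target really is an Iceberg table folder. The sketch below checks for the ‘metadata’ and ‘data’ subfolders described above. It runs against a local filesystem path purely as a stand-in for your storage location, and the function names are hypothetical helpers, not part of any Fabric or OneLake API.

```python
from pathlib import Path


def looks_like_iceberg_table(folder: Path) -> bool:
    """Return True if `folder` directly contains 'metadata' and 'data' subfolders."""
    return (folder / "metadata").is_dir() and (folder / "data").is_dir()


def find_iceberg_tables(root: Path) -> list[Path]:
    """List folders at or under `root` that look like Iceberg table folders."""
    # Assumption: the storage location is reachable as a local path.
    candidates = [root, *(p for p in root.rglob("*") if p.is_dir())]
    return sorted(p for p in candidates if looks_like_iceberg_table(p))
```

Each folder returned by `find_iceberg_tables` is a candidate target path for a shortcut. A table that has not had any rows written yet may lack a ‘data’ folder, so treat a negative result as a hint rather than proof.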

We’re working on full support for all Iceberg data types, and as this is a Public Preview, there are some temporary limitations documented here.

Snowflake integration

Snowflake already allows users to write Iceberg tables to Azure Data Lake Storage, Azure Blob Storage, Amazon S3, and Google Cloud Storage. With today’s announcement, Snowflake is releasing the ability for Snowflake on Azure users to write Iceberg tables to OneLake. This is another key step in the partnership we announced at Microsoft Build earlier this year.

For those who are familiar with writing Iceberg tables to Azure Storage from Snowflake on Azure, you can simply update your code to use a OneLake path, grant your Snowflake account’s identity access to OneLake, and write your Iceberg tables. For detailed guidance, see the instructions here.
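As a rough illustration of what "update your code to use a OneLake path" might look like, the sketch below builds a OneLake base URL and renders a Snowflake `CREATE EXTERNAL VOLUME` statement around it. The path shape (`<workspace>/<lakehouse>.Lakehouse/Files/...`), the helper names, and all argument values are assumptions for illustration; check the linked instructions for the exact path and permissions your environment needs.

```python
def onelake_base_url(workspace: str, lakehouse: str, subfolder: str = "") -> str:
    """Build an azure:// base URL for a lakehouse Files area in OneLake.

    Assumed path shape: <workspace>/<lakehouse>.Lakehouse/Files/<subfolder>
    """
    path = f"{workspace}/{lakehouse}.Lakehouse/Files"
    if subfolder:
        path += "/" + subfolder.strip("/")
    return f"azure://onelake.dfs.fabric.microsoft.com/{path}"


def external_volume_sql(volume_name: str, base_url: str, tenant_id: str) -> str:
    """Render a Snowflake CREATE EXTERNAL VOLUME statement for an Azure location.

    No escaping is applied: pass trusted, already-validated values only.
    """
    return (
        f"CREATE OR REPLACE EXTERNAL VOLUME {volume_name}\n"
        "  STORAGE_LOCATIONS = ((\n"
        f"    NAME = '{volume_name}'\n"
        "    STORAGE_PROVIDER = 'AZURE'\n"
        f"    STORAGE_BASE_URL = '{base_url}'\n"
        f"    AZURE_TENANT_ID = '{tenant_id}'\n"
        "  ));"
    )
```

For example, `external_volume_sql("onelake_exvol", onelake_base_url("MyWorkspace", "MyLakehouse", "icebergtables"), "<Tenant_ID>")` produces a statement you could adapt and run in Snowflake. Granting your Snowflake account’s identity access to OneLake is still a separate step.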

How does this work?

Apache Iceberg tables can be used across Fabric workloads through a feature called metadata virtualization, which allows Iceberg tables to be read as Delta Lake tables when accessed through a shortcut. Behind the scenes, this feature uses Apache XTable for table format metadata conversion.

When you create a shortcut to an Iceberg table folder, OneLake automatically generates the corresponding Delta Lake metadata (the Delta log) for that table, making that Delta Lake format accessible through the shortcut. When updates are made to an Iceberg table, fresh Delta Lake metadata is served through the shortcut upon future requests.
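The key idea is that only metadata is translated: the data files stay where they are and are simply referenced from a Delta log. The toy sketch below illustrates that idea and nothing more. It is not how OneLake or Apache XTable is implemented, the input dictionaries are a simplified stand-in for Iceberg manifest entries, and a real Delta commit would also need `protocol` and `metaData` actions, partition values, and statistics.

```python
import json


def delta_add_actions(data_files: list[dict], commit_time_ms: int) -> str:
    """Render newline-delimited Delta 'add' actions for existing data files.

    Each input dict is assumed to carry 'file_path' and 'file_size_in_bytes',
    loosely mirroring fields tracked for data files in Iceberg metadata.
    """
    actions = []
    for f in data_files:
        actions.append({
            "add": {
                "path": f["file_path"],          # data file is referenced, not copied
                "partitionValues": {},           # assumption: unpartitioned table
                "size": f["file_size_in_bytes"],
                "modificationTime": commit_time_ms,
                "dataChange": True,
            }
        })
    return "\n".join(json.dumps(a) for a in actions)
```

Because the output only points at files that already exist, no table data is moved or duplicated, which matches the behavior described above.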

What’s next?

As we gather feedback during this Public Preview, our integration with Snowflake will continue with some key new features, including:

  • Automatic conversion of Delta Lake formatted tables to Iceberg
  • Converting tables that are directly written to OneLake
  • Schema-level shortcuts – one shortcut, multiple Iceberg tables
  • Deeper integration with Snowflake including a dedicated Snowflake data item in Fabric to automatically sync Iceberg and Delta tables
