Author: Matillion
Date Posted: Aug 22, 2024
Last Modified: Mar 28, 2025
Load streaming data from cloud storage
Process the latest files from cloud storage to maintain tables in your cloud data platform.
If you have configured a streaming pipeline with Amazon S3 or Azure Blob Storage as the destination, you can use these pre-built Data Productivity Cloud pipelines to load the resulting Avro files into Snowflake, Databricks, or Amazon Redshift.
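Conceptually, the pre-built pipelines perform an incremental load: on each run they pick up only the storage objects written since the last successful load and apply them in order. The sketch below illustrates that pattern in plain Python under stated assumptions; the object listing, field names, and `files_to_load` helper are hypothetical and not part of the Matillion pipelines themselves.

```python
from datetime import datetime, timezone

def files_to_load(objects, high_water_mark):
    """Return storage objects modified after the last successful load,
    oldest first, so they can be applied in order."""
    new_files = [o for o in objects if o["last_modified"] > high_water_mark]
    return sorted(new_files, key=lambda o: o["last_modified"])

# Hypothetical listing of Avro files in an S3 bucket or Blob container.
listing = [
    {"key": "orders/part-0001.avro",
     "last_modified": datetime(2025, 3, 27, 9, 0, tzinfo=timezone.utc)},
    {"key": "orders/part-0002.avro",
     "last_modified": datetime(2025, 3, 28, 9, 0, tzinfo=timezone.utc)},
]
mark = datetime(2025, 3, 27, 12, 0, tzinfo=timezone.utc)
print([o["key"] for o in files_to_load(listing, mark)])
# → ['orders/part-0002.avro']
```

In practice the pre-built pipelines handle this bookkeeping for you; the sketch only shows why each run loads the latest files rather than re-reading the whole bucket or container.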
Requirements
To load the files into Snowflake, you can use a Data Productivity Cloud project configured with either a Full SaaS or Hybrid SaaS agent.
To load the files into Databricks or Amazon Redshift, you must use a Data Productivity Cloud project configured with a Hybrid SaaS agent.
Installation
- Open a branch on your Data Productivity Cloud project.
- Click “Add > Browse Exchange”.

- Search for “streaming” and select the pipeline to import it into your project.
- You should now have a folder named “Imported from Exchange > Load Streaming Data from Cloud Storage” containing the latest versions of these pipelines.
Usage
Open the orchestration pipeline “Imported from Exchange > Load Streaming Data from Cloud Storage > Example”.
Follow the instructions in the notes in this pipeline to copy the Run Orchestration component into your own orchestration pipeline, then configure it to load your Avro files into your data platform.
See the Matillion documentation for full descriptions of the parameters.

Downloads
Licensed under: Matillion Free Subscription License
- Download pre_built_pipelines_streaming_databricks_20250328.zip
- Target: Databricks
- Download pre_built_pipelines_streaming_redshift_20250328.zip
- Target: Redshift
- Download pre_built_pipelines_streaming_snowflake_20250328.zip
- Target: Snowflake