Matillion ETL Shared Job

Author: Matillion
Date Posted: Oct 27, 2023
Last Modified: Nov 9, 2023

Move S3 File

Move a file from one location in AWS S3 cloud storage to another, optionally changing the file name.

Moving an object in S3 requires a copy followed by a delete.

  • The copy will silently overwrite the target file if it already exists.
  • It is possible in rare circumstances for the copy to succeed and the delete to fail, leaving behind two copies of the original file.
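The copy-then-delete sequence above can be sketched with boto3 as follows. This is a minimal illustration, not the shared job's actual implementation: the function names `move_s3_object` and `make_client`, and the guard against moving an object onto itself, are assumptions added for the example. The client is passed in as an argument so the logic can be exercised without AWS access.

```python
def move_s3_object(s3, source_bucket, source_key, target_bucket, target_key):
    """Copy an object to the target location, then delete the original.

    `s3` is expected to be a boto3 S3 client (or any object exposing
    copy_object and delete_object with the same keyword arguments).
    """
    if (source_bucket, source_key) == (target_bucket, target_key):
        # Assumed safeguard: deleting after a self-copy would remove the only copy.
        raise ValueError("Source and target refer to the same object")

    # Step 1: copy. An existing target object is overwritten without warning.
    s3.copy_object(
        Bucket=target_bucket,
        Key=target_key,
        CopySource={"Bucket": source_bucket, "Key": source_key},
    )

    # Step 2: delete the original. If this call raises, both copies remain.
    s3.delete_object(Bucket=source_bucket, Key=source_key)


def make_client():
    """Create an S3 client using boto3's default credential lookup."""
    import boto3  # imported here so the function above has no hard dependency

    return boto3.client("s3")


# Example call (bucket and object names are hypothetical):
# move_s3_object(make_client(), "my-source-bucket", "in/data.csv",
#                "my-target-bucket", "archive/data.csv")
```

Because the delete runs only after the copy returns, a failure in the copy leaves the source untouched, while a failure in the delete leaves two copies, matching the rare case described above.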

The shared job will fail if the source or target bucket is not known, if the source file does not exist, or if the AWS identity lacks the privilege to write to S3.

Parameters

Parameter | Description
Access Key | Optional AWS Access Key. Leave blank or set to - to authenticate using EC2 instance credentials (preferred)
Secret Access Key | Optional AWS Secret Access Key. Leave blank or set to - to authenticate using EC2 instance credentials (preferred)
Source Bucket Name | The source S3 bucket name (do not include the s3:// prefix, do not include the object path)
Source Object Name | The source S3 object name (including path)
Target Bucket Name | The target S3 bucket name (do not include the s3:// prefix, do not include the object path)
Target Object Name | The target S3 object name (including path)
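The bucket name rules in the table (no s3:// prefix, no object path) can be checked before the job runs. The sketch below is illustrative only: `check_bucket_name` is a hypothetical helper, not part of the shared job, and it tests just the two rules stated above rather than the full set of S3 bucket naming rules.

```python
def check_bucket_name(name):
    """Return the trimmed bucket name, or raise ValueError if it breaks
    one of the two rules stated in the parameter descriptions."""
    name = name.strip()
    if not name:
        raise ValueError("Bucket name is empty")
    if name.lower().startswith("s3://"):
        raise ValueError("Bucket name must not include the s3:// prefix")
    if "/" in name:
        raise ValueError("Bucket name must not include the object path")
    return name


# Example (bucket names are hypothetical):
# check_bucket_name("my-bucket")            returns "my-bucket"
# check_bucket_name("s3://my-bucket")       raises ValueError
# check_bucket_name("my-bucket/in/a.csv")   raises ValueError
```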

Using instance credentials on AWS

When hosted on AWS, give the EC2 instance permission to read from and write to S3. Refer to the Prerequisites section for more information. This Shared Job will inherit those privileges. Set the Access Key and Secret Access Key parameters to a single dash, or leave them blank.
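One way to honor the "single dash or blank" convention is to translate the two parameters into keyword arguments for `boto3.client`. This is a sketch under assumptions: `credential_kwargs` is a hypothetical helper, and rejecting a half-supplied key pair is a choice made for this example, not documented behavior of the shared job. When no explicit keys are passed, boto3 falls back to its default credential chain, which includes the EC2 instance profile.

```python
def credential_kwargs(access_key, secret_key):
    """Map the two job parameters to boto3 client keyword arguments."""

    def is_unset(value):
        # Blank, whitespace-only, None, or a single dash all mean "not supplied".
        return value is None or value.strip() in ("", "-")

    if is_unset(access_key) and is_unset(secret_key):
        # No explicit keys: boto3 will look up instance credentials itself.
        return {}
    if is_unset(access_key) or is_unset(secret_key):
        # Assumed behavior for this sketch: a half-supplied pair is an error.
        raise ValueError("Supply both Access Key and Secret Access Key, or neither")
    return {
        "aws_access_key_id": access_key.strip(),
        "aws_secret_access_key": secret_key.strip(),
    }


# Example (requires boto3):
# import boto3
# s3 = boto3.client("s3", **credential_kwargs("-", "-"))
```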

[Figure: Move S3 File on AWS]

Usage on Azure or GCP

When hosted on Azure or GCP, permissions cannot be inherited. You must supply an Access Key and a Secret Access Key.

[Figure: Move S3 File on Azure or GCP]

Prerequisites

This shared job requires Python 3.8.

To avoid a ModuleNotFoundError, the following Python libraries must be available:

  • boto3
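A quick way to confirm the library is available before running the job is to look it up without importing it. The helper name `missing_modules` is hypothetical; the sketch uses only the standard library.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this
    Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: an empty list means boto3 can be imported.
# print(missing_modules(["boto3"]))
```

If the returned list is non-empty, install the listed libraries into the Python environment that the Matillion ETL instance uses.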

When running on AWS, permissions can be inherited from the Matillion ETL instance. This is preferred. In this case, set the Access Key and Secret Access Key parameters to a single dash, or leave them blank. Ensure that the EC2 instance credentials attached to your Matillion ETL instance include the privilege to read from and write to S3. For more information, refer to the “IAM in AWS” section in this article on RBAC in the Cloud.

When running on other platforms, authentication is done using AWS access keys for programmatic access to AWS. Supply both the Access Key and the Secret Access Key parameters.


Downloads

Licensed under: Matillion Free Subscription License

  • Download move-s3-file.melt
    • Target: Any target cloud data platform
    • Version: 1.68.3 or higher

Installation Instructions

How to Install a Matillion ETL Shared Job