Author: Matillion
Date Posted: Mar 22, 2025
Last Modified: Mar 22, 2025
GitHub Action Publish Artifact
Publish Data Productivity Cloud pipelines as an Artifact with this GitHub Action
The attached main.yml file contains a GitHub Action that uses the Data Productivity Cloud Public API to upload all pipeline files in a project and publish them as an artifact to a given environment.
Setup in GitHub
Add the YAML workflow file main.yml to .github/workflows/ in the root of your repository.
Add the following secrets to the repository under Settings > Secrets and variables > Actions:
MATILLION_PUBLIC_API_CLIENT_ID
: Client ID for Matillion API access.
MATILLION_PUBLIC_API_CLIENT_SECRET
: Client Secret for Matillion API access.
Add the following repository variables under Settings > Secrets and variables > Actions:
MATILLION_PROJECT_ID
: The ID of the Matillion project you are working with.
Pipeline Setup
The workflow includes several hardcoded environment variables that need to be configured appropriately for your team and project. Below is an explanation of each variable, its purpose, and how to configure it:
Versioning
VERSION_PREFIX
: v
- Purpose: Sets the prefix for semantic versioning (e.g., v1.0.0).
- Default:
v
- Customization: Update this if your versioning scheme requires a different prefix (e.g.,
release-
for release-1.0.0).
Authentication
CLIENT_ID
: ${{ secrets.MATILLION_PUBLIC_API_CLIENT_ID }}
- Purpose: Matillion API Client ID for authentication.
- Source: Stored securely in GitHub Secrets.
- Setup:
- Go to Settings > Secrets and variables > Actions.
- Add
MATILLION_PUBLIC_API_CLIENT_ID
with the Client ID from your Matillion account.
CLIENT_SECRET
: ${{ secrets.MATILLION_PUBLIC_API_CLIENT_SECRET }}
- Purpose: Matillion API Client Secret for authentication.
- Source: Stored securely in GitHub Secrets.
- Setup:
- Go to Settings > Secrets and variables > Actions.
- Add
MATILLION_PUBLIC_API_CLIENT_SECRET
with the Client Secret from your Matillion account.
API Endpoints
TOKEN_URL
: https://id.core.matillion.com/oauth/dpc/token
- Purpose: Matillion OAuth2 token endpoint.
- Default: This URL is specific to Matillion’s authentication system and should not need modification.
PROJECTS_URL
: https://eu1.api.matillion.com/dpc/v1/projects
- Purpose: Base URL for Matillion API requests.
- Default:
https://eu1.api.matillion.com/dpc/v1/projects
- Customization:
- If your Matillion account is in a different region, update this URL (e.g.,
https://us1.api.matillion.com/dpc/v1/projects
for US regions).
Project and Environment
PROJECT_ID
: ${{ vars.MATILLION_PROJECT_ID }}
- Purpose: The ID of the Matillion project being managed by this workflow.
- Source: Stored in GitHub Variables.
- Setup:
- Go to Settings > Secrets and variables > Actions > Repository Variables.
- Add
MATILLION_PROJECT_ID
with your project’s ID from Matillion.
ENVIRONMENT_NAME
: DataOps Whitepaper-production
- Purpose: Human-readable name of the environment where pipelines are deployed.
- Customization:
- Replace
DataOps Whitepaper-production
with your target environment’s name.
ENVIRONMENT_NAME_URL_ENCODED
: DataOps%20Whitepaper-production
- Purpose: URL-encoded version of the
ENVIRONMENT_NAME
for API calls.
- Customization:
- Replace
DataOps%20Whitepaper-production
with the URL-encoded version of your environment name.
- Use an online tool to encode spaces and special characters (e.g.,
My Environment
→ My%20Environment
).
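As an alternative to an online tool, the encoded value can be produced locally. The following is a minimal sketch using Python's standard library; the function name encode_environment_name is illustrative and is not part of the workflow.

```python
from urllib.parse import quote


def encode_environment_name(name: str) -> str:
    """Percent-encode an environment name for use in a URL."""
    # safe="" also encodes "/" so the name cannot be read as extra path segments.
    return quote(name, safe="")


# Value to paste into ENVIRONMENT_NAME_URL_ENCODED
print(encode_environment_name("DataOps Whitepaper-production"))
```

Hyphens, letters, and digits are left unchanged; spaces become %20.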
Test Pipelines
TEST_PIPELINE_NAMES
: dataops-orchestration-pipeline,test-pipeline-2
- Purpose: List of pipelines to verify and execute during the testing phase of the workflow.
- Customization:
- Provide a comma-separated list of pipeline names (e.g.,
pipeline1,pipeline2
).
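To illustrate the expected format, here is a small sketch of how a comma-separated value such as TEST_PIPELINE_NAMES could be split into individual names. The helper parse_pipeline_names is hypothetical; the workflow itself may parse the value differently, and trimming whitespace is an assumption made here for robustness.

```python
def parse_pipeline_names(raw: str) -> list[str]:
    """Split a comma-separated list of pipeline names, ignoring blanks."""
    # Assumption: surrounding whitespace is not significant in pipeline names.
    return [name.strip() for name in raw.split(",") if name.strip()]


names = parse_pipeline_names("dataops-orchestration-pipeline,test-pipeline-2")
print(names)
```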
Execution Timing
CHECK_INTERVAL
: 15
- Purpose: Time (in seconds) to wait before checking pipeline statuses.
- Default:
15
- Customization:
- Adjust this based on the typical runtime of your pipelines to balance efficiency and API rate limits.
Note: The workflow currently does not poll. It waits for this interval and then checks the status once, so the value is effectively a timeout for the test pipeline step.
Overview of the GitHub Action
This GitHub Action automates the process of validating, deploying, and testing Data Productivity Cloud pipelines on the main branch.
It runs whenever a commit is pushed or merged to the main branch (and can also be triggered manually).
It consists of the following jobs:
Validate Pipelines
- Checks out the repository.
- Validates YAML files using yamllint.
Execute Pipelines
Runs only after the validate-pipelines job succeeds.
Performs several tasks:
- Calculate Next Version: Determines the next semantic version by incrementing the patch version.
- Generate Access Token: Authenticates with Matillion and generates a token for subsequent API calls.
- Upload Artifacts and Publish to Default Environment: Uploads .orch.yaml and .tran.yaml files to Matillion as part of the artifact deployment.
- Tag Main Branch: Tags the main branch with the new version using the calculated semantic version.
- Verify Test Pipelines: Ensures the specified test pipelines are published in the environment.
- Execute Test Pipelines: Executes the pipelines and retrieves execution IDs.
- Wait for Pipeline Completion: Waits for CHECK_INTERVAL seconds, then checks the status of the executed pipelines once to confirm that all completed successfully.
- All Pipelines Completed Successfully: Outputs success if all pipelines execute as expected.
Step-by-Step Explanation of Each Job
Validate Pipelines
- Purpose: Ensures the syntax of YAML files in the repository is valid.
- Command: Uses yamllint to validate all YAML files.
Calculate Next Version
- Purpose: Calculates the next semantic version by inspecting existing tags and incrementing the patch version.
- Logic:
- Fetches all tags.
- Identifies the latest tag (e.g., v1.0.0).
- Increments the patch version to e.g. v1.0.1.
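The version logic above can be sketched as follows. This is an illustration only, not the code in main.yml: the function name next_patch_version is hypothetical, "latest" is taken to mean the highest numeric version, and the starting version used when no matching tag exists is an assumption.

```python
import re


def next_patch_version(tags, prefix="v"):
    """Return the next patch version after the highest matching tag."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)\.(\d+)\.(\d+)$")
    versions = []
    for tag in tags:
        match = pattern.match(tag)
        if match:
            # Compare numerically so that 1.0.10 sorts after 1.0.9.
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        # Assumption: first version to use when the repository has no tags yet.
        return f"{prefix}0.0.1"
    major, minor, patch = max(versions)
    return f"{prefix}{major}.{minor}.{patch + 1}"


print(next_patch_version(["v0.9.9", "v1.0.0"]))
```

Tags that do not match the VERSION_PREFIX pattern are ignored, so unrelated tags in the repository do not affect the result.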
Generate Access Token
- Purpose: Authenticates with Matillion and generates a bearer token for API calls.
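For reference, a token request of this kind can be sketched with Python's standard library. This assumes a standard OAuth2 client-credentials exchange against TOKEN_URL; the audience value and the helper name build_token_request are assumptions, so check them against the Matillion API documentation before relying on them. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://id.core.matillion.com/oauth/dpc/token"


def build_token_request(client_id, client_secret, token_url=TOKEN_URL):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumption: audience expected by the Matillion token endpoint.
        "audience": "https://api.matillion.com",
    }).encode("ascii")
    return Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_token_request("example-client-id", "example-client-secret")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; in a standard OAuth2 response the bearer token is returned in the access_token field of the JSON body.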
Upload Artifacts and Publish to Default Environment
- Purpose: Uploads files (.orch.yaml, .tran.yaml, .py and .sql) to Matillion.
- Implementation:
- Finds all relevant files in the repository.
- Constructs a single POST request with a multipart payload containing all files.
- Includes headers for versionName, environmentName, and commitHash.
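The file discovery part of this step can be sketched as below. The suffix list comes from the description above; the helper name find_pipeline_files and the choice to skip the .git directory are assumptions, and the actual workflow may select files differently.

```python
from pathlib import Path

# File types listed for upload in this step.
PIPELINE_SUFFIXES = (".orch.yaml", ".tran.yaml", ".py", ".sql")


def find_pipeline_files(root):
    """Return all uploadable files under root, in a stable sorted order."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.name.endswith(PIPELINE_SUFFIXES)
        # Assumption: Git metadata is never part of the artifact.
        and ".git" not in path.relative_to(root).parts
    )


for path in find_pipeline_files("."):
    print(path)
```

Each file found this way would become one part of the multipart POST payload.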
Tag Main Branch
- Purpose: Tags the main branch with the calculated version.
- Implementation:
- Uses the GitHub API via actions/github-script to create the tag.
Verify Test Pipelines
- Purpose: Ensures the specified test pipelines are published in the environment.
- API: Calls the Matillion published-pipelines API and checks if all expected pipelines exist.
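The check itself is a simple comparison between the expected names and the names returned by the API. A minimal sketch, assuming the published pipeline names have already been extracted from the API response into a list (the helper missing_pipelines is hypothetical):

```python
def missing_pipelines(expected, published):
    """Return the expected pipeline names that are not published, in order."""
    published_set = set(published)
    return [name for name in expected if name not in published_set]


missing = missing_pipelines(
    ["dataops-orchestration-pipeline", "test-pipeline-2"],
    ["dataops-orchestration-pipeline"],
)
if missing:
    print("Not published:", ", ".join(missing))
```

An empty result means every test pipeline is available in the environment; a non-empty result is a reason to fail the step.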
Execute Test Pipelines
- Purpose: Executes the specified pipelines in the given environment.
- API: Calls the Matillion pipeline-executions API to initiate execution for each pipeline.
- Outputs: Collects pipelineExecutionId for each executed pipeline.
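A request to start one pipeline might look like the following sketch. The URL path under PROJECTS_URL and the JSON field names pipelineName and environmentName are assumptions based on the API names mentioned above, and build_execution_request is a hypothetical helper. Verify the exact request shape in the Data Productivity Cloud API reference. The sketch builds the request without sending it.

```python
import json
from urllib.request import Request

PROJECTS_URL = "https://eu1.api.matillion.com/dpc/v1/projects"


def build_execution_request(project_id, pipeline_name, environment_name,
                            token, projects_url=PROJECTS_URL):
    """Build (but do not send) a request that starts one pipeline execution."""
    # Assumption: request body field names expected by the API.
    body = json.dumps({
        "pipelineName": pipeline_name,
        "environmentName": environment_name,
    }).encode("utf-8")
    return Request(
        # Assumption: endpoint path for starting an execution.
        f"{projects_url}/{project_id}/pipeline-executions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_execution_request(
    "example-project-id", "test-pipeline-2",
    "DataOps Whitepaper-production", "example-token",
)
print(request.get_method(), request.full_url)
```

Because the environment name travels in the JSON body here, it does not need URL encoding in this request.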
Wait for Pipeline Completion
- Purpose: Monitors the status of executed pipelines.
- API: Waits for CHECK_INTERVAL seconds, then calls the Matillion pipeline-executions status API once to check the status of each execution.
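The wait-then-check behaviour can be sketched as follows. The status lookup is passed in as a function so the sketch stays independent of the API; the name wait_then_check and the status value "SUCCESS" are assumptions, not confirmed details of the workflow or the API.

```python
import time


def wait_then_check(execution_ids, fetch_status, interval_seconds=15,
                    sleep=time.sleep):
    """Wait once, then return the executions that have not succeeded."""
    sleep(interval_seconds)  # single wait, no polling loop
    statuses = {eid: fetch_status(eid) for eid in execution_ids}
    # Assumption: "SUCCESS" is the status reported for a completed run.
    return {eid: status for eid, status in statuses.items()
            if status != "SUCCESS"}


# Example with canned statuses and no real waiting.
canned = {"exec-1": "SUCCESS", "exec-2": "RUNNING"}
print(wait_then_check(canned, canned.get, sleep=lambda seconds: None))
```

With a single check, a pipeline that is still running after the interval is reported as not successful, which is why CHECK_INTERVAL should exceed the typical runtime of the test pipelines.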
All Pipelines Completed Successfully
- Purpose: Confirms all pipelines executed successfully.
- Output: Displays a success message if all pipelines succeed.
How to Run
Trigger Automatically:
- Push or merge changes to the main branch to automatically trigger the workflow.
Trigger Manually:
- Go to the Actions tab in your repository.
- Select the workflow and click Run workflow.
Downloads
Licensed under: Matillion Free Subscription License
- Download main.yml
- Note: GitHub Action