Matillion ETL Job

Example Matillion ETL jobs that maintain Type 1 and Type 2 Slowly Changing Dimension (SCD) tables.

Slowly Changing Dimensions

This is a set of related jobs that demonstrate various techniques for maintaining Type 1 and Type 2 SCD tables.

Data Model

The example is a general purpose “event” dimension, where

Running these jobs will create three tables and one sequence in your default database and schema.

EVENT_DIM_T1

A type 1 “event” dimension, with no history tracking.

ColumnDescription
event_skSurrogate key for the dimension. This would be used as the foreign key on a fact table. Enforced unique. In these examples the value is maintained manually
event_nkNatural key of the event
event_typeThe event type
event_categoryThe event category

Like any type 1 dimension, there are two alternaltive unique identifiers

EVENT_DIM_T2

A type 2 “event” dimension, with history tracking.

ColumnDescription
event_skSurrogate key for the dimension. This would be used as the foreign key on a fact table. Enforced unique, and maintained automatically using the SEQ_EVENT_DIM sequence
event_nkNatural key of the event
event_typeThe event type
event_categoryThe event category
valid_fromTimestamp this version of the record became valid
valid_toTimestamp this version of the record ceased to be valid. Within every event_nk, the combination of valid_from and valid_to should form a continuous, non overlapping range of time
is_latestBoolean flag indicating if this version of the record is currently the latest
versionAn ascending integer version number. Within every event_nk, the record with the highest version has is_latest set to TRUE

Filtering this table on is_latest = TRUE should produce the same data as the type 1 dimension.

Like any type 2 dimension, there are several unique identifiers

STG_EVENT_DIM_EXAMPLE

Used as the data source for the dimension tables. In a medallion data architecture this would be a bronze layer object (two-tier model) or a silver layer object (three tier model).

ColumnDescription
event_nkNatural key of the event. Enforced unique
event_typeThe event type
event_categoryThe event category

The Example SCD run job creates a new set of randomized test data in this table.

How to use the jobs

There are five jobs in the download:

Start with Example SCD begin setup, and run it once. It will create all the database objects. If you run it again a second time, it will re-create them all empty again.

Example SCD begin setup

Now run Example SCD run to send the first set of test data through the transformation jobs. It populates the type 1 and the type 2 dimension tables from the same set of test data. Check the data in the dimension tables by using a SQL client, or the sample tab of one of the Table Input components in the transformation jobs.

For the type 2 dimension, you can switch between the high- and low-code variations by updating the Run Transformation component. The high code version is slightly faster at runtime, but requires hand coding to maintain. The results in the tables are identical whichever one you use.

Example SCD run

Run Example SCD run several times to build up some history in the type 2 dimension.

Take a look at how the transformation jobs work. You are free to copy and customize them for your own needs.

Downloads

Licensed under: Matillion Free Subscription License

Download METL-sf-1.61.6-slowly-changing-dimensions.json


Author: Matillion