Merge Join
Description
The Merge Join step performs a classic merge join between data sets with data coming from two different input steps. Join options include INNER, LEFT OUTER, RIGHT OUTER, and FULL OUTER.
Note: In this step rows are expected in to be sorted on the specified key fields. When using the Sort step, this works fine. When you sorted the data outside of PDI, you may run into issues with the internal case sensitive/insensitive flag. Further information can be found on PDI-11440.
Note: If the key fields have the same name (e.g. ID), a second key field called $key_1 (e.g. ID_1) will be created in the result.
Options
The following table describes the options available for the Merge Join step:
Option |
Description |
---|---|
Step name |
Name of the step; this name has to be unique in a single transformation |
First Step |
Specify the first input step to the merge join (left) |
Second Step |
Specify the second input step to the merge join (right) |
Join Type |
Select from the available types of joins:
|
Keys for 1st step |
Specify the key fields on which the incoming data is sorted; click Get key fields to retrieve a list of fields from the specified step. |
Keys for 2nd step |
Specify the key fields on which the incoming data is sorted; click Get key fields to retrieve a list of fields from the specified step. |
To find out how to partition this step see this blog from Dan Keeley
Metadata Injection Support (7.x and later)
All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.