Description
Merge rows allows you to compare two streams of rows. This is useful for comparing snapshots of the same data taken at two different times. It is often used in situations where the source system of a data warehouse does not contain a date of last update.
The two streams of rows, a reference stream (the old data) and a compare stream (the new data), are merged. Only the last version of a row is passed to the next steps each time. The row is marked as follows:
- "identical" - The key was found in both streams and the values to compare are identical;
- "changed" - The key was found in both streams but one or more values are different;
- "new" - The key was not found in the reference stream;
- "deleted" - The key was not found in the compare stream.
The row coming from the compare stream is passed on to the next steps, except when it is flagged "deleted" or "identical"; in those cases the row from the reference stream is passed on instead.
Important: Both streams must be sorted on the specified key(s).
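The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the step's actual implementation: the function name `merge_rows`, the dictionary row format, and the default flag field name `flagfield` are assumptions made for the example. The sketch relies on both inputs already being sorted on the key fields, which is why the sort requirement matters: each stream is read only once, in order.

```python
from typing import Any, Dict, Iterable, Iterator, Sequence

Row = Dict[str, Any]


def merge_rows(
    reference: Iterable[Row],
    compare: Iterable[Row],
    keys: Sequence[str],
    values: Sequence[str],
    flag_field: str = "flagfield",  # hypothetical default name
) -> Iterator[Row]:
    """Yield one flagged row per key from two streams sorted on `keys`."""

    def key_of(row: Row) -> tuple:
        return tuple(row[k] for k in keys)

    ref_iter, new_iter = iter(reference), iter(compare)
    old = next(ref_iter, None)
    new = next(new_iter, None)

    while old is not None or new is not None:
        if new is None or (old is not None and key_of(old) < key_of(new)):
            # Key exists only in the reference stream.
            yield {**old, flag_field: "deleted"}
            old = next(ref_iter, None)
        elif old is None or key_of(new) < key_of(old):
            # Key exists only in the compare stream.
            yield {**new, flag_field: "new"}
            new = next(new_iter, None)
        else:
            # Key exists in both streams: compare the value fields.
            if all(old[v] == new[v] for v in values):
                yield {**old, flag_field: "identical"}
            else:
                yield {**new, flag_field: "changed"}
            old = next(ref_iter, None)
            new = next(new_iter, None)


# Illustrative data, already sorted on "id".
reference = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
]
compare = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bobby"},
    {"id": 4, "name": "Dee"},
]

for row in merge_rows(reference, compare, keys=["id"], values=["name"]):
    print(row)
```

If either input were unsorted, the single forward pass would pair the wrong rows and report spurious "new" and "deleted" flags, so sort both streams on the key fields before they reach the step.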
Options
Option | Description |
---|---|
Step name | Name of the step; this name has to be unique in a single transformation. |
Reference rows origin | Specify the step origin for the reference rows: the stream with the original rows, or the rows you want to compare the new rows to. |
Compare rows origin | Specify the step origin for the compare rows: the stream with the new rows. |
Flag fieldname | Specify the name of the flag field on the output stream. |
Keys to match | Specify fields containing the keys on which to match; click Get key fields to insert all of the fields originating from the reference rows step. |
Values to compare | Specify fields containing the values to compare; click Get value fields to insert all of the fields from the originating value rows step. |