Extraction of data from source systems

The complexity of an extraction batch depends heavily on the environment:

  • Is the source system mission critical?
  • Can the source system sustain a long-running query?
  • Is the source system on the local LAN or in the cloud?
  • Is the source system being accessed continuously?

There are two main scenarios:

  • Push - the Kettle batch runs on the source system and pushes data to the ETL staging area
  • Pull - the Kettle batch runs on the ETL server and pulls data into the ETL staging area

Pattern 1: Full extract with output truncate

The Kettle script consists of an input step and an output step.

The output step is set to truncate the target table before loading.
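
Kettle transformations are built graphically, so the following is not Kettle code. It is a minimal Python sketch of the same logic (read everything from the source, empty the staging table, write everything), using in-memory SQLite databases as stand-ins. The table and column names (customers, stg_customers, id, name) are hypothetical, and DELETE is used only because SQLite has no TRUNCATE statement.

```python
import sqlite3


def full_extract_truncate(source, staging):
    """Copy every row from the source table into an emptied staging table.

    Mirrors an input step feeding an output step that truncates first.
    Table names are hypothetical placeholders.
    """
    # Input step: read the full source table.
    rows = source.execute("SELECT id, name FROM customers").fetchall()

    # Output step: empty the target, then load, in a single transaction
    # so a failed load does not leave the staging table half filled.
    with staging:
        staging.execute("DELETE FROM stg_customers")  # stand-in for TRUNCATE
        staging.executemany(
            "INSERT INTO stg_customers (id, name) VALUES (?, ?)", rows
        )
    return len(rows)


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    src.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
    )

    stg = sqlite3.connect(":memory:")
    stg.execute("CREATE TABLE stg_customers (id INTEGER, name TEXT)")

    # Running the extract twice must not duplicate rows in staging.
    full_extract_truncate(src, stg)
    full_extract_truncate(src, stg)
    print(stg.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0])
```

Because the target is emptied on every run, the extract can be rerun safely: the staging table always holds exactly one copy of the source data.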

Pattern 2: Full extract with SQL script

The Kettle script consists of an input step and an output step, plus a SQL script step that is not connected to either of them.
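
The page does not say what the SQL script does. As an illustration only, the Python sketch below assumes it prepares the staging table (drop and recreate) before any rows flow from input to output. It uses in-memory SQLite databases as stand-ins, and every name in it (orders, stg_orders, PREPARE_SQL) is hypothetical.

```python
import sqlite3

# Assumed content of the standalone SQL script: rebuild the staging table.
PREPARE_SQL = """
DROP TABLE IF EXISTS stg_orders;
CREATE TABLE stg_orders (id INTEGER PRIMARY KEY, amount REAL);
"""


def full_extract_with_script(source, staging, prepare_sql=PREPARE_SQL):
    """Run a preparation script, then copy all source rows to staging."""
    # Standalone SQL script: runs on its own, not as part of the row flow.
    staging.executescript(prepare_sql)

    # Input step: read the full source table.
    rows = source.execute("SELECT id, amount FROM orders").fetchall()

    # Output step: plain insert, no truncate option needed here because
    # the script above already left the table empty.
    with staging:
        staging.executemany(
            "INSERT INTO stg_orders (id, amount) VALUES (?, ?)", rows
        )
    return len(rows)


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    src.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0), (3, 4.25)]
    )

    stg = sqlite3.connect(":memory:")
    loaded = full_extract_with_script(src, stg)
    print("rows loaded:", loaded)
```

Compared with Pattern 1, keeping the preparation in a separate script leaves room for more than a truncate, for example recreating the table or dropping indexes before the load.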

Pattern 3: Full extract
