Teradata TPT Insert Upsert Bulk Loader

Description

This Teradata Parallel Transporter (TPT) step uses the TBuild command.  TPT supports Insert and Upsert through bulkloading. (Upsert is a combination of an Insert and Update action, which is also known as a "merge.")  This step emulates the traditional Teradata MultiLoad utility.

It is recommended that you use this step instead of the Teradata Fastload bulkloader.  TPT is a combination of the FastLoad, MultiLoad, FastExport, and TPump Teradata Load Utilities.  Because of this change, Teradata no longer develops new features for the other load utilities.  

Pentaho has released this step within PDI 5.1 GA.

Installation and Configuration

Installation: Install the Teradata Tools and Utilities on any machine that needs to execute the TPT loader. Make sure that the Teradata Parallel Transporter is installed and enter the location of TeraData client installation and paths in the "Tbuild Execution Environment Options."

Note: The bulk loader is supported and was tested successfully with Teradata Database and Teradata Parallel Transporter version 14.00.  For other supported versions, please check the Pentaho Infocenter.

Operating systems: The bulk loader only runs under operating systems that support named pipes (FIFO). This includes every modern operating system except Windows.

Authorization: Bulk loading occurs while your transformation runs under the logged in user. The bulk load command needs to have permissions to read from the FIFO file.

Please see the TPT Reference and User Guide for more installation and configuration details.  Teradata documentation can be found here: Teradata Information Products page.

Options

General Options

Option

Description

Step Name

Name of the step.

Connection

Indicates the connection to the Teradata database.

Job Script

Specifies whether to automatically generate a bulk loader script (also known as a control file) or use an existing one. Your selection determines which tabs are enabled.

Teradata Client Installation

Displays the path to the Teradata client installation.  Change the default to match the correct path for your environment if needed.

TWB_ROOT

Indicates the path to your Teradata installation.  Change the default to match the correct path for your environment if needed.

Teradata Libs

Specifies the path to the Teradata libs directory.  Change the default to match the correct path for your environment if needed.

TBuild Libs

Displays the path to the tbuild/libs directory.  Tbuild executes TPT commands.  Change the default to match the correct path for your environment if needed.

TDICU Libs

Indicates the path to the tdicu/libs directory.  TDICU is the Teradata internationalization component.  Change the default to match the correct path for your environment if needed.

COPLIB and COPERR

Specifies the path to COPLIB and COPERR.  COPLIB and COPERR are Call Level Interface (CLI) environment variables.  Change the default to match the correct path for your environment if necessary.

TBuild Executable

Displays the path to the TBuild executable.  Change the default to match the correct path for your environment if needed.

FIFO Base Filename

Specifies the temporary FIFO (First In First Out) file name for interprocess communication between the TPT tools and PDI.

Append Random Value to FIFO File Name

Ensures that each run of the TPT bulk loader produces a unique FIFO base file name. If this option is not selected and the TPT bulk loader runs in parallel, all runs use the same FIFO file. This can yield inconsistent results. Selecting this checkbox makes the FIFO unique for each run by appending a random value to the file name when it is physically written to the directory.

Job Name

Specifies the name that identifies the job.

Help

Provides help on the Teradata TPT Insert Upsert Bulk Loader window.

OK

Saves information in the window, then closes it.

Preview Script

Provides a preview of the script.

Cancel

Closes the window without saving information.

Job Script Options

Option

Description

Target Schema

Indicates the target schema in the database.

Target Table

Specifies the target table in the database.

Restart Log Table

Defines the table that stores restart information.

Work Table

Specifies the table that stores work information.

Error Table

Indicates the table that stores error information.

Uniqueness Violation Error Table

Displays the name of the table that stores uniqueness violation errors.

Drop Tables Before Load

Indicates whether Restart Log, Work, Error, and Uniqueness Violation Error tables should be dropped before load starts.  Make sure the user ID has drop permissions.

Update Rows Handling Options

Indicates the options available for row handling.  Choices are IGNORE DUPLICATE, INSERT FOR MISSING, and IGNORE MISSING.

Access Log File

Indicates the path and filename of the access log file.

Update Log File

Specifies the path and filename of the update log file.

Generated Script File Name

Specify the filename of the generated script file.  The generated script file is the control file that will be executed by a TBuild command.

Keys and Fields

Option

Description

Loading Method

Indicates the loading method.  Choose Insert or Upsert.

The key(s) to look up in the value(s)

When upserting, this defines the key(s) to lookup the values.

Fields to Load

Defines the fields to load and if they should be updated or not.

Get Fields

Loads fields into the Fields to load table.

Mapping

Displays the Enter Mapping window.

Job Script File

Option

Description

Control File

Indicates the path and filename of the control file if you want to execute an existing TPT control file.

Substitute Kettle Variables in Control File

Indicates that when the job script file is read, the TPT_FIFO_FILENAME variable is populated with the FIFO Base File Name indicated in the General tab. For this step to work properly, the ${TPT_FIFO_FILENAME} variable must be added to the job script file as shown in the example.

VARCHAR FileName = '$\{TPT_FIFO_FILENAME\}';

Note: Modify "FileName" in the example above to reflect the name of the file name variable in your script.

Tbuild Variable File

Indicates a local job variables file.

Enter Mapping

Option

Description

Source fields

Indicates unmapped fields that are from the source table.

Target fields

Describes the fields in the target table.

Mappings

Displays the mappings made between the Source and Target fields.

Add

Adds the Source-Target mapping to the Mappings panel.

Delete

Removes the Source-Target mapping from the Mappings panel.

Auto target selection

Indicates whether to automatically select the target.

Auto source selection

Indicates whether to automatically select the source.

Hide assigned source fields

Indicates whether to hide the already mapped source fields from view.

Hide assigned target fields

Indicates whether to hid the already mapped target fields from view.

OK

Saves mappings and closes the window.

Guess

Automatically generates mappings for you.

Cancel

Closes the window without saving the mappings.