Transformation Executor
PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.
WORK IN PROGRESS
This topic needs further documentation to make it great. If you have experience with this transformation step, we encourage you to update this topic. More information can be found in JIRA case DOC-2111.
Description
The transformation executor allows you to execute a Pentaho Data Integration transformation. It is similar to the Job Executor step but works on transformations.
By default the specified transformation will be executed once for each input row. This row can be used to set parameters and variables and it is passed to the transformation in the form of a result row.
You can also pass a group of rows to the transformation, based either on the value in a field (the transformation is executed when the value changes) or on time. In these cases, the first row of the group of rows is used to set parameters or variables in the transformation.
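The field-based grouping described above can be illustrated with a small Python simulation. This is only a sketch of the semantics, not PDI code: `run_child`, `execute_grouped`, the parameter name `PARAM`, and the sample `region` and `file` fields are hypothetical names made up for illustration.

```python
from itertools import groupby


def run_child(params, result_rows):
    """Hypothetical stand-in for one execution of the child transformation."""
    return {"params": params, "row_count": len(result_rows)}


def execute_grouped(rows, group_field, param_field):
    """Run the child once per run of consecutive rows sharing group_field."""
    executions = []
    # groupby starts a new group each time the key value changes, which
    # mirrors "the transformation is executed when the value changes".
    for _, group in groupby(rows, key=lambda r: r[group_field]):
        group_rows = list(group)
        # Only the first row of the group sets parameters or variables.
        params = {"PARAM": group_rows[0][param_field]}
        executions.append(run_child(params, group_rows))
    return executions


rows = [
    {"region": "EU", "file": "a.csv"},
    {"region": "EU", "file": "b.csv"},
    {"region": "US", "file": "c.csv"},
]
executions = execute_grouped(rows, "region", "file")
```

Note that only consecutive rows are grouped, so in this sketch the input would need to be sorted on the grouping field if each distinct value should trigger exactly one execution.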
It is possible to launch multiple copies of this step to facilitate parallel transformation processing.
Note: This step does not abort when the transformation it calls errors out. To control the flow or abort the transformation in case of errors, specify the fields and a target step in the "Execution results" tab to get the number of errors (fixed by PDI-12759 in PDI version 5.3).
Note: In the current implementation, the log of the parent transformation contains only the last processed batch of data. It was implemented this way to limit the strain on the logging back-end. The detailed log of the child transformation can be obtained from the execution results: define a target step in the "Execution results" tab and look at the field set under "Fieldname of execution logging text" (by default ExecutionLogText).
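As a rough illustration of the two notes above, the following Python sketch shows the kind of check a target step could apply to the execution result rows. It is not PDI code. The field name `ExecutionLogText` comes from the note above; the error count field name `ExecutionNrErrors` and the function `abort_on_errors` are assumptions chosen for this example.

```python
def abort_on_errors(result_rows,
                    errors_field="ExecutionNrErrors",   # assumed field name
                    log_field="ExecutionLogText"):
    """Raise as soon as one child execution reports errors.

    Returns the number of checked result rows when none of them failed.
    """
    for row in result_rows:
        if row[errors_field] > 0:
            # Surface the child's own log text, since the parent log
            # only holds the last processed batch.
            raise RuntimeError(
                f"Child transformation reported {row[errors_field]} "
                f"error(s):\n{row[log_field]}"
            )
    return len(result_rows)


results = [
    {"ExecutionNrErrors": 0, "ExecutionLogText": "finished"},
    {"ExecutionNrErrors": 2, "ExecutionLogText": "table not found"},
]
try:
    abort_on_errors(results)
    outcome = "finished"
except RuntimeError:
    outcome = "aborted"
```

In an actual transformation the equivalent logic would typically be built from steps that filter on the error count field and then abort, rather than written as code.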
Options
Option | Description |
---|---|
Step name | Name of the step. Note: This name has to be unique in a single transformation. |
Transformation | Use this section to specify the transformation to execute. You have the following options to specify the transformation: |
Parameter Options tab
In this tab you can specify which field to use to set a certain parameter or variable value. If multiple rows are passed to the transformation, the first row is used to set the parameters or variables.
Option | Description |
---|---|
Variable / Parameter name | The Parameters tab allows you to define or pass Kettle variables down to the transformation. |
Field to use | Specify which field to use to set a certain parameter or variable value. If you specify an input field to use, the static input value is not used. |
Static input value | Instead of a field to use you can specify a static value here. |
If you enable the "Inherit all variables from the transformation" option, all the variables defined in the parent transformation are passed to the child transformation.
A button in the lower right corner of the tab inserts all the defined parameters of the specified transformation. For reference, the description of each parameter is inserted into the static input value field.
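The precedence between "Field to use" and "Static input value" can be sketched in Python as follows. This is a simplified model of the rules described in this tab, not PDI code; `resolve_parameters`, the parameter names, and the sample row are hypothetical.

```python
def resolve_parameters(mappings, first_row):
    """Build the parameter values for one execution of the child.

    mappings: list of (name, field_to_use, static_value) tuples,
    one per line of the parameter grid.
    first_row: the first row of the group passed to the child.
    """
    resolved = {}
    for name, field, static_value in mappings:
        if field:
            # A configured input field wins; the static value is not used.
            resolved[name] = first_row[field]
        else:
            resolved[name] = static_value
    return resolved


mappings = [
    ("INPUT_FILE", "filename", "ignored-static-value"),
    ("TARGET_SCHEMA", None, "staging"),
]
first_row = {"filename": "sales.csv"}
params = resolve_parameters(mappings, first_row)
```

Here `INPUT_FILE` takes its value from the `filename` field of the first row, while `TARGET_SCHEMA` falls back to its static value because no field is configured for it.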
Row grouping Options tab
On this tab you can specify the number of input rows that are passed to the transformation in the form of result rows. You can read these result rows with a Get rows from result step in the transformation.
Option | Description |
---|---|
The number of rows to send to the transformation | After every X rows the transformation is executed and these X rows are passed to it. |
Field to group rows on | Rows are accumulated in a group as long as the field value stays the same. If the value changes, the transformation is executed and the accumulated rows are passed to it. |
The time to wait collecting rows before execution | The time in milliseconds that the step spends accumulating rows before executing the transformation. |
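The row count option in the table above amounts to splitting the input into fixed-size batches. A minimal Python sketch of that behaviour follows; it is a simulation and not PDI code, `batches` is a hypothetical helper, and the handling of leftover rows at the end of the input is an assumption rather than something this page states.

```python
def batches(rows, size):
    """Yield consecutive lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            # One execution of the child per full batch of X rows.
            yield batch
            batch = []
    if batch:
        # Assumption: leftover rows are sent in a final, smaller execution.
        yield batch


rows = [{"id": i} for i in range(7)]
batch_sizes = [len(b) for b in batches(rows, 3)]
```

With seven input rows and a batch size of three, this sketch produces two full batches and one final batch holding the remaining row.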
Result tabs
Please see the Job Executor step - the usage is identical.
Example
WORK IN PROGRESS. For an example, please see http://jira.pentaho.com/browse/PDI-12204 (with current issues in 5.0.6).