Transformation Definitions
The table below contains a list of transformation definitions:
Transformation |
Description |
---|---|
Value |
Values are part of a row and can contain any type of data: strings, floating point numbers, unlimited precision BigNumbers, integers, dates or boolean values |
Row |
A row exists of 0 or more values |
Input Stream |
A stack of rows that enters a step |
Hop |
A graphical representation of one or more data streams between two steps; a hop always represents the output stream for one step and the input stream for another — the number of streams is equal to the copies of the destination step (one or more) |
Note |
Descriptive text that can be added to a transformation |
Job Definitions
The table below contains a list of job definitions:
Job |
Description |
---|---|
Job Entry |
A part of a job that performs a specific task |
Hop |
A graphical representation of one or more data streams between two steps; a hop always represents the output stream for one step and the input stream for another — the number of streams is equal to the copies of the destination step (one or more) |
Note |
Descriptive text that that can be added to a job |
Toolbar
The icons on the toolbar of the main screen are from left to right:
Icon |
Description |
---|---|
|
Create a new job or transformation |
|
Open transformation/job from file if you're not connected to a repository or from the repository if you are connected to one. |
|
Save the transformation/job to a file or to the repository. |
|
Save the transformation/job under a different name or filename. |
|
Open the print dialog. |
|
Run transformation/job: runs the current transformation from XML file or repository. |
|
Preview transformation: runs the current transformation from memory. You can preview the rows that are produced by selected steps. |
|
Run the transformation in debug mode allowing you to troubleshoot execution errors. |
|
Replay the processing of a transformation for a certain date and time. This will cause certain steps (Text File Input and Excel Input) to only process rows that failed to be interpreted correctly during the run on that particular date and time. |
|
Verify transformation: Spoon runs a number of checks for every step to see if everything is going to run as it should. |
|
Run an impact analysis: what impact does the transformation have on the used databases. |
|
Generate the SQL that is needed to run the loaded transformation. |
|
Launches the database explorer allowing you to preview data, run SQL queries, generate DDL and more. |
Options
Kettle options allow you to customize properties associated with the behavior and look and feel of the graphical user interface. Examples include startup options such as whether or not to display tips and the Kettle Welcome Page, and user interface options such as fonts and colors. To access the options dialog, select Edit|Options... from the menu bar.
General Tab
Feature |
Description |
---|---|
Maximum Undo Level |
Sets the maximum number of steps that can be undone (or redone) by Spoon |
Default number of lines in preview dialog |
Allows you to change the default number of rows that are requested from a step during transformation previews |
Maximum nr of lines in the logging windows |
Specifies the maximum limit of rows to display in the logging window |
Show tips at startup? |
Sets the display of tips at startup |
Show welcome page at startup? |
Controls whether or not to display the Welcome page when launching Spoon |
Use database cache? |
Spoon caches information that is stored on the source and target databases. In some instances, caching causes incorrect results when you are making database changes. To prevent errors you can disable the cache altogether instead of clearing the cache every time. |
Note: Spoon automatically clears the database cache when you launch DDL (Data Definition Language) statements towards a database connection; however, when using third party tools, clearing the database cache manually may be necessary.
Feature |
Description |
---|---|
Open last file at startup? |
Loads the last transformation you used (opened or saved) from XML or repository automatically |
Auto save changed files? |
Automatically saves a changed transformation before running |
Only show the active file in the main tree? |
Reduces the number of transformation and job items in the main tree on the left by only showing the currently active file |
Only save used connections to XML? |
Limits the XML export of a transformation to the used connections in that transformation. This is helpful while exchanging sample transformations to avoid having all defined connections to be included. |
Ask about replacing existing connections on open/import? |
Requests permission before replacing existing database connections during import |
Replace existing connections on open/import? |
This is the action that's being taken when there is no dialog shown. (see previous option) (huh?) |
Show "Save" dialog? |
Allows you to turn off the confirmation dialogs you receive when a transformation has been changed |
Automatically split hops? |
Turns off the confirmation dialogs you get when you want to split a hop (see also 7.4. Splitting A Hop) |
Show "copy or distribute" dialog? |
Turns off the warning message that appears when you link a step to multiple outputs. This warning message describes the two options for handling multiple outputs:
|
Show repository dialog at startup? |
Controls whether or not the repositories dialog shows up at startup. |
Ask user when exiting? |
Controls whether or not to display the confirmation dialog when a user chooses to exit the application. |
Clear custom parameters (steps/plug-ins) |
Clears all parameters and flags that were set in the plug-in or step dialogs. |
Display tool tips? |
This option controls whether or not to display tool tips for the buttons on the main tool bar. |
Look and Feel tab
Note: Add introductory text here...
Feature |
Description |
---|---|
Fixed width font |
The font that is used in the dialog boxes, trees, input fields, and so on |
Font on workspace |
The font that is used on the graphical view |
Font for notes |
The font to use in notes that are displayed in the Graphical View |
Background color |
Sets the background color in Spoon. It affects all dialogs too |
Workspace background color |
Sets the background color in the Graphical View of Spoon |
Tab color |
The color that is being used to indicate tabs that are active/selected. |
Icon size in workspace |
Affects the size of the icons in the graph window. The original size of an icon is 32x32 pixels. The best results (graphically) are probably at sizes 16,24,32,48,64 and other multiples of 32. |
Line width on workspace |
Affects the line width of the hops on the Graphical View and the border around the step. |
Shadow size on workspace |
If this size is larger then 0, a shadow of the steps, hops, and notes is drawn on the canvas, making it look like the transformation floats above the canvas. |
Dialog middle percentage |
By default, a parameter is drawn at 35% of the width of the dialog, counted from the left. You can change this with this parameter. May be useful in cases where you use unusually large fonts. |
Canvas anti-aliasing? |
Some platforms like Windows, OSX and Linux support anti-aliasing through GDI, Carbon or Cairo. Check this to enable smoother lines and icons in your graph view. If you enable this and your environment doesn't work any more afterwards, change the value for option "EnableAntiAliasing" to "N" in file $HOME/.kettle/.spoonrc (C:\Documents and Settings\<user>\.kettle\.spoonrc on Windows) |
Use look of OS? |
Checking this on Windows allows you to use the default system settings for fonts and colors in Spoon. On other platforms, this is always the case. |
Show branding graphics |
Enabling this option will draw Pentaho Data Integration branding graphics on the canvas and in the left hand side "expand bar". |
Preferred Language |
Here you can specify the default language setting. If a certain text hasn't been translated into this locale, Kettle will fall back to the fail over locale. |
Alternative Language |
Because the original language in which Kettle was written is English, it's best to set this locale to English. |
Search Metadata
This option will search in any available fields, connectors or notes of all loaded jobs and transformations for the string specified in the Filter field. The metadata search returns a detailed result set showing the location of any search hits. This feature is accessed by choosing Edit|Search metadata from the menu bar.
Set environment variable
The Set Environment Variable feature allows you to create and set environment variables for the current user session explicitly. This is a useful feature when designing transformations for testing variable substitutions that are normally set dynamically by another job or transformation.
This feature is accessible by choosing Edit|Set Environment Variable from the menu bar.
Note: This page also displays when you run a transformation that use undefined variables. This allows you to define them right before execution time.
Show environment variables
This feature displays the current list of environment variables and their values. It is accessed by selecting the Edit|Show environment variables option from the menu bar.
Execution log history
If you have configured your Job or Transformation to store log information in a database table, you can view the log information from previous executions by right-clicking on the job or transformation in the Main Tree and selecting 'Open History View'. A view similar to the one below appears:
Note: todo: replace screenshot when PDI-224 is fixed
Note: The log history for a job or transformation also opens by default each next time you execute the file.
Replay
The Replay feature allows you to re-run a transformation that failed. Replay functionality is implemented for Text File Input and Excel input. It allows you to send files that had errors back to the source and have the data corrected. Only the lines that failed before processed during the replay if a .line file is present. The Replay feature uses the date in the filename of the .line file to match the entered replay date.
Generate mapping against target step
In cases where you have a fixed target table, map the fields from the stream to their corresponding fields in the target output table. Use a Select Values step in your transformation. The 'Generate mapping against target' option provides you with an easy-to-use dialog for defining these mappings that automatically creates the resulting Select Values step that can be dropped into your transformation flow prior to the table output step.
To access the 'Generate mapping against target' option right click in the table output step.
After defining your mappings, select OK and the Select Values step containing your mappings will appear on the workspace. Attach the mapping step into your transformation just before the table output step.
Generate mappings example
Below is an example of a simple transformation in which we want to generate mappings to our target output table:
- Begin by right-clicking on the Table output step and selecting 'Generate mappings against target'.
- Add all necessary mappings using the Generate Mapping dialog shown above and click OK. You will now see a Table output mapping step has been added to the canvas.
- Drag the generated Table output Mapping step into your transformation flow prior to the table output step:
Safe mode
Use safe mode when you are mixing rows from many sources to ensure that all rows have the same layout in all conditions. The safe mode option is available in the Spoon logging window or on the Execute a Transformation/Job window. When running in safe mode, the transformation checks every row that passes and ensures that all layouts are identical. If a row is does not have the same layout as the first row, an error is generated and reported.
Note: This option is also available in Pan.
Welcome Screen
The Welcome page displays the first time you launch Spoon 3.0; it provides you with links to additional information about Pentaho Data Integration. You can disable Welcome page in Spoon options (Edit|Options).